ABSTRACT

We present optimal quadratic estimators for the Fourier analysis of cosmological surveys that detect several different types of tracers of large-scale structure. Our estimators can be used to simultaneously fit the matter power spectrum and the biases of the tracers, as well as redshift-space distortions (RSDs), non-Gaussianities (NGs), or any other effects that are manifested through differences between the clusterings of distinct species of tracers. Our estimators reduce to the one by Feldman, Kaiser & Peacock (ApJ 1994, FKP) in the case of a survey consisting of a single species of tracer. We show that the multi-tracer estimators are unbiased, and that their covariance is given by the inverse of the multi-tracer Fisher matrix (Abramo, MNRAS 2013; Abramo & Leonard, MNRAS 2013). When the biases, RSDs and NGs are fixed to their fiducial values, and one is only interested in measuring the underlying power spectrum, our estimators are projected into the estimator found by Percival, Verde & Peacock (MNRAS 2003). We have tested our estimators on simple (lognormal) simulated galaxy maps, and we show that they perform as expected, being equivalent or superior to the FKP method in all cases we analyzed. Finally, we have shown how to extend the multi-tracer technique to include the 1-halo term of the power spectrum.

Key words: cosmology: theory - large-scale structure of the Universe 


1 INTRODUCTION 

Astrophysical surveys have come to occupy a central role in cosmology (York et al. 2000; Cole et al. 2005; Abbott et al. 2005; Scoville et al. 2007; Adelman-McCarthy et al. 2008a,b; PAN-STARRS; BOSS; Blake et al. 2011). Percent-level accuracies can now be reached on distance measurements at high redshifts, both across and along the line of sight (Anderson et al. 2012, 2014). This means more and better constraints on the acceleration of the expansion rate of the Universe, on modified gravity (Linder 2005), on non-Gaussian initial conditions (Verde et al. 2000; Bartolo et al. 2004), and on the role of massive neutrinos, among other applications.

The groundbreaking achievements of the Sloan Digital Sky Survey (York et al. 2000) are being surpassed by other surveys with higher completeness, wider wavelength coverage, and a larger range of redshifts (BigBOSS; SUMIRE; Ellis et al. 2012; Abell et al. 2009). Some surveys will specialize in mapping very large volumes to an extremely high completeness, by employing imaging with narrow-band filters (Benitez et al. 2009, 2014), or by resorting to low-resolution integral-field spectroscopy (Hill 2008). In addition to galaxies, quasars can also serve either as sources of background light to investigate the intervening matter through the Ly-α forest (Slosar et al. 2013), or directly as tracers of the large-scale structure (Croom et al. 2005; da Angela et al. 2005; Yahata et al. 2005; Shen et al. 2007; Ross et al. 2009; Sawangwit et al. 2012; Abramo et al. 2012; Leistedt et al. 2013; Leistedt & Peiris 2014). Another way in which the 3D matter distribution can be mapped is through the 21cm hyperfine transition of neutral H, and new radiotelescopes dedicated to measuring that line are being deployed or are in the planning stages (Bandura et al. 2014; Battye et al.).


This intense activity points to an exciting future, where 
vast volumes of the Universe will be increasingly mapped in 
a variety of ways, and with different types of tracers of the 
large-scale structure. 

However, these maps are not independent, since all tracers sit on the same distribution of dark matter. This points to a key obstacle to exploiting the full power of these overlapping surveys: cosmic variance, a particular case of sample variance, where the sample is the set of modes of the density perturbations which were realized in some region of the Universe from a (nearly) Gaussian random process.

Despite this fundamental limitation, it was pointed out by Seljak (2009); McDonald & Seljak (2008) that the bounds imposed by cosmic variance do not apply for many key physical quantities of interest. In particular, by comparing the clustering of tracers with different biases one can measure some parameters to an accuracy which is basically unconstrained by cosmic variance. This applies not only to bias itself, but to the matter growth rate of the RSDs, to the NG parameters $f_{NL}$ and $g_{NL}$, as well as any parameters that affect the relative clusterings of different tracers. The consequences of this extraordinary windfall were explored in many papers: see, e.g., Slosar (2009); Gil-Marín et al. (2010); Cai & Bernstein (2011a,b); Hamaus et al. (2011); Smith, Desjacques & Marian (2011); Hamaus et al. (2012); Abramo & Leonard (2013); Blake et al. (2013); Bull et al. (2014); Ferramacho et al. (2014); Hamaus et al. (2014). It is important to stress that this additional information comes from measuring the two-point functions of the different tracers, as opposed to enhancing the signal-to-noise by employing different statistics such as mass-weighting (Seljak et al. 2009; Cai et al. 2011; Smith & Marian 2014).

E-mail: abramo@if.usp.br

In Abramo & Leonard (2013) we showed that the enhanced constraints of multi-tracer cosmological surveys are a straightforward consequence of the multi-tracer Fisher information matrix (Abramo 2012). In the presence of $N$ different types of tracers, each one with a different bias, there is a simple choice of variables which diagonalizes the multi-tracer Fisher matrix: in addition to the underlying power spectrum (which is subject to the cosmic variance bounds), there are $N-1$ variables which correspond to the relative clustering strengths between the tracers. These relative clustering strengths are not affected by cosmic variance, and their measurements can be arbitrarily accurate, even if the survey has a finite volume. In the case of a single tracer, the multi-tracer Fisher matrix reduces to the usual case treated in the seminal paper by Feldman et al. (1994) (henceforth FKP).

In this paper we use the multi-tracer Fisher matrix to derive optimal estimators for the redshift-space power spectra of an arbitrary number of different types of tracers of large-scale structure (see Sec. 3). These tracers may overlap in some regions but not others, or not at all. The tracers can be galaxies of different types, quasars, Ly-α absorption systems, etc. One may also choose to replace individual objects with halos of different masses, in which case the bias of the tracers becomes the halo bias (Seljak et al. 2009; Hamaus et al. 2010; Cai et al. 2011; Smith & Marian 2014, 2015).

Our formulas can be used in any of those situations, in real or in redshift space, including effects such as scale-dependent bias. An important cross-check is that a particular combination (or projection) of our estimators leads back to the estimator of Percival et al. (2003) (henceforth PVP). Namely, the PVP estimator follows from ours if the biases of the tracers, as well as the RSDs, are fixed to their true values, and one then computes the underlying matter power spectrum using the aggregated clustering information from all the tracers.

We have also incorporated the contribution of the 1-halo term to the covariance of the galaxy counts (see Sec. 6) into the multi-tracer Fisher matrix. In principle, the full covariance can be computed using the Halo Model (Cooray & Sheth 2002), together with appropriate halo occupation distributions for the tracers. Recently, Smith & Marian (2015) presented an optimal estimator for the power spectrum including all Halo Model corrections to the spectrum, bispectrum and trispectrum. However, the Smith & Marian (2015) estimator generalizes the estimator of Percival et al. (2003), while we have obtained estimators not only for the matter power spectrum, but also for the biases, RSDs, NGs, etc. Hence, we are only able to include the simplest correction to galaxy clustering from the halo model (the 1-halo term), but our framework allows the simultaneous estimation of bias, RSDs and the power spectrum. Smith & Marian (2015), on the other hand, include all the corrections that can be computed on the basis of the Halo Model, but their estimator applies only to the power spectrum, and assumes prior knowledge of the bias of all the species, as well as of the shape of the RSDs.

The Fisher matrix and the covariance of the counts of the tracers are the basic objects used to construct our estimators. In fact, the optimal estimators are a type of Wiener filtering, in the sense that we are basically weighting the data by the inverse of their covariance. We employ results and notation introduced in Abramo (2012); Abramo & Leonard (2013), and the construction of the estimators follows the steps outlined in Tegmark (1997); Tegmark et al. (1998).

The estimators were tested using simple mock galaxy catalogs based on lognormal maps (Sec. 5). We find that the empirical covariance of the power spectra is well approximated by the theoretical covariance (the inverse of the multi-tracer Fisher matrix), confirming the optimality of the estimator.

The main formulas of this paper are derived in Sec. 3, and a practical algorithm for the Fourier analysis of multi-tracer surveys is summarized in Sec. 5.2.

This paper is organized as follows: in Sec. 2 we review the Fisher information matrix for single-tracer and multi-tracer cosmological surveys. In Sec. 3 we construct the optimal quadratic estimators on the basis of the covariance matrix for the data. In Sec. 4 we discuss the relationships between the multi-tracer estimators and other methods, such as FKP and PVP, as well as the main features of the multi-tracer technique. In Sec. 5 we test the estimators in simple simulated maps, showing that the empirical covariance matches closely the theoretical covariance, which establishes that the multi-tracer estimators are unbiased and optimal. In Sec. 6 we show how to include the 1-halo term in the estimators of the multi-tracer power spectra. We also show there how to construct estimators for the 1-halo term, and how to generalize the procedure to estimate simultaneously the 2-halo and the 1-halo term of the power spectrum. We conclude in Section 7.


2 THE INFORMATION IN GALAXY SURVEYS 

The matter power spectrum is defined through the expectation value $\langle\delta_m(\vec{k},z)\,\delta_m^*(\vec{k}',z)\rangle = (2\pi)^3 P_m(k,z)\,\delta_D(\vec{k}-\vec{k}')$, where $\delta_m(\vec{k})$ is the matter density contrast, and $\delta_D$ is the Dirac delta function. However, galaxy surveys actually measure counts of tracers of the large-scale structure (galaxies and other extragalactic objects) in redshift space. It is from those observables that we can then measure derived quantities such as the baryon acoustic oscillations (Eisenstein et al. 1999; Blake & Glazebrook 2003; Seo & Eisenstein 2003), or the pattern of redshift-space distortions (Kaiser 1987; Hamilton 2005a,b).


For a tracer of type $\alpha$, whose counts as a function of (redshift-space) position are $n_\alpha(\vec{x})$, the density contrast is $\delta_\alpha(\vec{x}) = n_\alpha(\vec{x})/\bar{n}_\alpha(\vec{x}) - 1$. The mean number densities $\bar{n}_\alpha$ should reflect the spatial modulations of the observed numbers of galaxies which are due to the instrument, the strategy and schedule of observations, as well as any other factors unrelated to the redshift-space cosmological fluctuations.

Fourier analysis of multi-tracer cosmological surveys

If we assume that bias is linear and deterministic, then in the distant-observer approximation the redshift-space fluctuations in the counts of the $\alpha$-type galaxies are related to the underlying mass fluctuations by the relation $\delta_\alpha(\vec{k},z) \simeq [b_\alpha + f(z)\,\mu_k^2]\,\delta_m(\vec{k},z)$. Here $b_\alpha$ is the bias of the tracer species $\alpha$, $f(z)$ is the matter growth rate, and $\mu_k = \hat{k}\cdot\hat{r}$ is the cosine of the angle between the Fourier mode and the line of sight.

The index $\alpha$ (we employ Greek letters to denote different tracer species) can be any kind of discriminant of the types of tracers of large-scale structure: it may stand for luminosity, morphological type, star formation rate, equivalent width of some emission line, or a combination of those. One may also regard the dark matter halos themselves as the tracers, in which case $\alpha$ would stand for the halo mass (or some proxy for it, such as richness), and the bias $b_\alpha$ becomes halo bias.

There are some complicating factors in this description. First, structure formation should introduce a scale dependence for the bias, as well as some degree of stochasticity (Benson et al. 2000; Dekel & Lahav 1999; Weinberg 2002; Smith, Scoccimarro & Sheth 2009). Second, the initial conditions may contain non-Gaussian features, which would manifest themselves as an additional scale-dependent bias (Bartolo et al. 2004; Sefusatti & Komatsu 2007; Dalal et al. 2008). Third, the velocity dispersion from random motions inside halos will smear the galaxy density contrast, affecting the shape of RSDs. In fact, the RSD parameters and angular dependence can inherit scale-dependent non-linear corrections (Raccanelli et al. 2012).

Hence, in practice it is more useful to regard “bias” as a more general function of redshift, scale, and angle with the line of sight that should be determined by observations, and define the clustering of a species $\alpha$ as:

$$P_\alpha(\vec{x},\vec{k}) = B_\alpha^2(\vec{x},\vec{k})\,P_m(\vec{x},k) , \tag{1}$$

where $B_\alpha = b_\alpha + f\mu_k^2 + \ldots$ is an effective bias. This effective bias can include not only RSDs and non-Gaussianities (NGs), but also scale-dependence of the bias, or any other effect that distorts the power spectrum of the tracers relative to the underlying matter distribution.

In principle, everything depends on $\vec{x}$ and $\vec{k}$, but one can regard $\vec{x}$ (i.e., the radial position) as standing in for redshift, so we could also write $B_\alpha = B_\alpha(z,k,\mu_k)$, and $P_m = P_m(z,k)$. Since the matter power spectrum, bias, RSDs, as well as NGs and other corrections, are just by-products of clustering measurements for all the available tracers in a given survey, the problem we must address is how one can optimally estimate the clusterings $P_\alpha(\vec{x},\vec{k})$. Our approach also means that cross-correlations are expressed as $P_{\alpha\beta} = B_\alpha B_\beta P_m$.


2.1 Optimal estimators and the Fisher matrix 


The Fisher information matrix can be constructed from the data covariance after a series of simple steps; for a review in the specific context of cosmological surveys, see, e.g., Tegmark et al. (1997); Tegmark (1997); Tegmark et al. (1998).

Let us assume that the measured quantities (the data) are $d_i$, such that, for simplicity, their expectation values vanish: $\langle d_i\rangle = 0$. The data covariance is then $C_{ij} = \mathrm{Cov}(d_i,d_j) = \langle d_i d_j\rangle$. From these data we would like to extract some set of parameters $p_\mu$, whose likelihoods we assume to be approximately described by a multivariate Gaussian. Under these conditions, the Fisher information matrix is given by:

$$\begin{aligned} F_{\mu\nu} = F[p_\mu,p_\nu] &= -\left\langle\frac{\partial^2\log\mathcal{L}}{\partial p_\mu\,\partial p_\nu}\right\rangle \\ &= \frac{1}{2}\sum_{ijkl} C^{-1}_{ij}\,\frac{\partial C_{jk}}{\partial p_\mu}\,C^{-1}_{kl}\,\frac{\partial C_{li}}{\partial p_\nu} , \end{aligned} \tag{2}$$

where the second line follows from the assumption of near-Gaussianity of the likelihood near its maximum.

Now suppose that estimators $\hat{p}_\mu$ can be constructed, such that their covariance is $\mathrm{Cov}(\hat{p}_\mu,\hat{p}_\nu) = F^{-1}_{\mu\nu}$. These estimators must then be optimal, in the sense that they saturate the Cramér-Rao bound: $\mathrm{Cov}(\hat{p}_\mu,\hat{p}_\nu) \geq F^{-1}_{\mu\nu}$.

There are in fact such estimators, which can be constructed after a few simple steps, and which employ the same basic objects that appear in the Fisher matrix. The first step is to create the quadratic form:

$$\hat{q}_\mu = \sum_{ij} E_\mu^{ij}\,d_i d_j - \Delta q_\mu , \tag{3}$$

where

$$E_\mu^{ij} = \frac{1}{2}\sum_{kl} C^{-1}_{ik}\,\frac{\partial C_{kl}}{\partial p_\mu}\,C^{-1}_{lj} . \tag{4}$$

Here, $\Delta q_\mu$ serves to subtract any possible bias the estimators may have, such that we end up with unbiased estimators whose expectation values coincide with their fiducial values.

Gaussianity of the data (a key underlying assumption) implies that the 4-point function is $\langle d_i d_j d_k d_l\rangle = C_{ij}C_{kl} + C_{ik}C_{jl} + C_{il}C_{jk}$, and from it follows, after some algebra, that the covariance of the quadratic form above is $\mathrm{Cov}(\hat{q}_\mu,\hat{q}_\nu) = F_{\mu\nu}$.

The final step is to define the optimal quadratic estimators in such a way that their covariance is the inverse of the Fisher matrix. Clearly, then, the estimators:

$$\hat{p}_\mu = \sum_\alpha F^{-1}_{\mu\alpha}\,\hat{q}_\alpha \tag{5}$$

satisfy that condition. Finally, with the definition:

$$\Delta q_\mu = \sum_{ij} E_\mu^{ij}\,C_{ij} - \sum_\beta F_{\mu\beta}\,p_\beta \tag{6}$$

we obtain estimators which are also unbiased: their expectation values are equal to the fiducial values of those parameters, $\langle\hat{p}_\mu\rangle \to p_\mu$.


2.2 Single-species Fisher matrix 

The most basic sources of uncertainty in galaxy surveys are cosmic variance and shot noise. In cosmological surveys which target a single species of tracer, the optimal estimator for the galaxy power spectrum was derived by FKP (Feldman et al. 1994). The corresponding Fisher information matrix was derived by Tegmark (1997); Tegmark et al. (1998), who also showed that the FKP estimator follows from the construction presented in the previous Section.

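The FKP Fisher matrix for a survey of some galaxy type $\alpha$ can be written as:

$$F_{\alpha;ij} = \int\frac{d^3x\,d^3k}{(2\pi)^3}\,\frac{\partial\log\mathcal{P}_\alpha}{\partial\theta^i}\,\mathcal{F}_\alpha\,\frac{\partial\log\mathcal{P}_\alpha}{\partial\theta^j} , \tag{7}$$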


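where the Fisher information density in phase space associated with the tracer $\alpha$ is:

$$\mathcal{F}_\alpha(\vec{x},\vec{k}) = \frac{1}{2}\left(\frac{\mathcal{P}_\alpha}{1+\mathcal{P}_\alpha}\right)^2 . \tag{8}$$

In the expression above we have defined a dimensionless “clustering strength” of the tracer $\alpha$, which is just the power spectrum in units of (Poissonian) shot noise:

$$\mathcal{P}_\alpha(\vec{x},\vec{k}) = \bar{n}_\alpha(\vec{x})\,P_\alpha(\vec{x},\vec{k}) . \tag{9}$$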


In the limit of arbitrarily high clustering strength ($1/\bar{n}_\alpha \to 0$, $\mathcal{P}_\alpha \to \infty$), the Fisher information density saturates the limit $\mathcal{F}_\alpha \to 1/2$. Hence, for a survey of a single species of tracer there is an upper limit to the information which can be extracted from a finite volume and from a finite range of Fourier modes. This is nothing but a restatement of the limits imposed by cosmic variance.
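The saturation of the single-tracer information density can be seen numerically with the short sketch below, which evaluates Eqs. (8) and (9) for increasing number density. The function names and the numerical values of the power spectrum and densities are illustrative assumptions only.

```python
import numpy as np

def clustering_strength(nbar, P):
    """Eq. (9): power spectrum in units of Poissonian shot noise."""
    return nbar * P

def fisher_density_single(nbar, P):
    """Eq. (8): Fisher information density per unit of phase-space volume."""
    Pc = clustering_strength(nbar, P)
    return 0.5 * (Pc / (1.0 + Pc)) ** 2

if __name__ == "__main__":
    P = 1.0e4  # illustrative amplitude of the tracer power spectrum
    for nbar in [1e-5, 1e-4, 1e-3, 1e-2, 1.0]:
        F = fisher_density_single(nbar, P)
        print(f"nbar = {nbar:8.1e}   F = {F:.4f}")   # approaches 1/2 from below
```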

At this point it is useful to recall how, in practice, one can extract limits on the amplitude of the power spectrum out of the Fisher matrix. In that case, the parameters of the Fisher matrix are the values of the matter power spectrum at given bandpowers (i.e., at some bins in Fourier space), obtained from a given survey volume. Consider, then, the parameters:

$$P_{\alpha,i} = \langle P_\alpha(\vec{x},\vec{k})\rangle_i , \tag{10}$$

where $\vec{x}_i$ represents a bin in real space (e.g., a redshift slice $z_i$) with volume $V_{\vec{x}_i}$, and $\vec{k}_i$ represents a bin in Fourier space (i.e., a bandpower $k_i$, and an angular bin $\mu_{k,i}$) of volume $\tilde{V}_{\vec{k}_i}$. In that case we should compute the Jacobian $\partial P_\alpha(\vec{x},\vec{k})/\partial P_{\alpha,i}$ inside Eq. (7). It is useful to regard such an object in terms of functional derivatives [1]. Using:

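$$\frac{\delta f(\vec{x},\vec{k})}{\delta f(\vec{x}',\vec{k}')} = (2\pi)^3\,\delta_D(\vec{x}-\vec{x}')\,\delta_D(\vec{k}-\vec{k}') , \tag{11}$$

one can easily derive that the inverse of the Jacobian is:

$$\frac{\partial P_{\alpha,i}}{\partial P_\alpha(\vec{x},\vec{k})} = \frac{1}{V_{\vec{x}_i}\,\tilde{V}_{\vec{k}_i}} \qquad \text{for } (\vec{x},\vec{k}) \text{ inside the bin } i . \tag{12}$$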


Therefore, the Jacobian $\partial P_\alpha(\vec{x},\vec{k})/\partial P_{\alpha,i}$ has the effect of limiting integrations in phase space, $\int d^3x\,d^3k/(2\pi)^3\,[\cdots]$, to the phase space volume of the bin $i$. Since this type of object will reappear later on, we employ the notation $\delta^i_{\vec{x},\vec{k}}$ to express the restriction of a phase space integral to a certain volume $V_i$, and we use the same notation to indicate restrictions in integrals over position space, $\delta^i_{\vec{x}}$, or Fourier space, $\delta^i_{\vec{k}}$. Hence, according to this notation:

$$\int\frac{d^3x\,d^3k}{(2\pi)^3}\,[\cdots]\;\delta^i_{\vec{x},\vec{k}} = \int_{V_i}\frac{d^3x\,d^3k}{(2\pi)^3}\,[\cdots] . \tag{13}$$

[1] In fact, all partial derivatives used in connection with the Fisher matrix should be replaced by functional derivatives in the continuum limit. It is only when we use bins (in real space and/or Fourier space) that these functional derivatives are converted to partial derivatives. Nevertheless, in order to keep the notation as simple as possible, we employ the same notation for both.


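Moreover, for non-overlapping bins $i$ and $j$ it follows that:

$$\int\frac{d^3x\,d^3k}{(2\pi)^3}\,[\cdots]\times\delta^i_{\vec{x},\vec{k}}\times\delta^j_{\vec{x},\vec{k}} = \delta_{ij}\int_{V_i}\frac{d^3x\,d^3k}{(2\pi)^3}\,[\cdots] . \tag{14}$$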


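With these identities in mind, it is trivial to see that when using $P_{\alpha,i}$ as parameters, Eq. (7) reduces to:

$$F_{\alpha,ij} = F[P_{\alpha,i},P_{\alpha,j}] = \frac{\delta_{ij}}{P_{\alpha,i}^2}\int_{V_i}\frac{d^3x\,d^3k}{(2\pi)^3}\,\mathcal{F}_\alpha(\vec{x},\vec{k}) . \tag{15}$$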


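A more familiar form for this equation follows if we revert to the definition of averages over real- and Fourier-space bins:

$$F_{\alpha,ij} = \frac{\delta_{ij}\,\tilde{V}_{\vec{k}_i}}{P_{\alpha,i}^2}\int_{V_{\vec{x}_i}} d^3x\,\langle\mathcal{F}_\alpha\rangle_{\vec{k}_i} . \tag{16}$$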


Up to a factor of 2, the integral over position space in the equation above defines the usual effective volume (Tegmark 1997; Tegmark et al. 1998). The uncertainty in the amplitude of the power spectrum at the bin $i$ is therefore given by the covariance $\mathrm{Cov}[P_{\alpha,i},P_{\alpha,j}] = F^{-1}_{\alpha,ij}$, which is diagonal in the Fourier modes (see, however, Abramo 2012). The relative uncertainty in the bandpowers of the power spectrum is then given by the well-known expression:

$$\frac{\sigma_{P_{\alpha,i}}}{P_{\alpha,i}} = \frac{1}{\sqrt{V_i\,\langle\mathcal{F}_\alpha\rangle_i}} , \tag{17}$$

where $V_i = V_{\vec{x}_i}\tilde{V}_{\vec{k}_i}$ is the phase space volume of the bin $i$, and $\langle\cdots\rangle_i$ denotes an average over the phase space bin. When the number density of the tracer is very high, $\mathcal{P}_\alpha \gg 1$ and $\mathcal{F}_\alpha \to 1/2$, and if that is the case, then the survey is dominated by cosmic variance, $\sigma_{P_{\alpha,i}}/P_{\alpha,i} \to \sqrt{2/V_i}$. The phase space volume gives the number of modes of the bin $\vec{k}_i$ that fit in the physical volume $V_{\vec{x}_i}$, and the factor of 2 comes from the fact that the density contrast is real.
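As a rough numerical illustration of the bandpower uncertainty, the sketch below evaluates the relative error that follows from Eq. (15) for a single tracer. It assumes a constant number density and power spectrum inside the bin, and takes the Fourier bin to be a full spherical shell (no angular binning); both are simplifying assumptions, and the function names are hypothetical.

```python
import numpy as np

def phase_space_volume(V_x, k_lo, k_hi):
    """V_i = V_x * (volume of the k-shell) / (2 pi)^3, for a full spherical shell."""
    shell = 4.0 * np.pi * (k_hi**3 - k_lo**3) / 3.0
    return V_x * shell / (2.0 * np.pi) ** 3

def relative_error(nbar, P, V_x, k_lo, k_hi):
    """sigma_P / P = 1 / sqrt(V_i * F), with F the information density of Eq. (8)."""
    Pc = nbar * P
    F = 0.5 * (Pc / (1.0 + Pc)) ** 2
    return 1.0 / np.sqrt(phase_space_volume(V_x, k_lo, k_hi) * F)

if __name__ == "__main__":
    V_x, k_lo, k_hi, P = 1.0e9, 0.05, 0.06, 1.0e4   # illustrative values
    for nbar in [1e-5, 1e-4, 1e-3, 1e-1]:
        print(f"nbar = {nbar:7.1e}   sigma_P/P = {relative_error(nbar, P, V_x, k_lo, k_hi):.4f}")
    print("cosmic-variance floor:", np.sqrt(2.0 / phase_space_volume(V_x, k_lo, k_hi)))
```

Increasing the number density lowers the error only until the cosmic-variance floor is reached, which is the single-tracer limit that the multi-tracer technique is designed to circumvent for relative clustering amplitudes.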


2.3 Multi-tracer Fisher matrix 

Galaxy surveys can detect a wide variety of objects: galaxies of different types, quasars, Ly-α emitters, Ly-α absorbers, etc. In the future all these data will coalesce into multi-layer maps of the observable Universe, containing many different kinds of objects which can be regarded as tracers of the large-scale structure.

The multi-tracer Fisher information matrix describes how the contributions of cosmic variance and shot noise affect the signal-to-noise ratio (SNR) of the observables we are trying to measure, namely the clustering properties of the tracers. While the nature of shot noise remains basically the same in the presence of multiple tracers, the effect of cosmic variance, which is shared among all tracers, mixes the different components.

The first authors to write a multi-tracer Fisher matrix (or, equivalently, a covariance matrix for the power spectra) were White et al. (2008), McDonald & Seljak (2008) and Hamaus et al. (2012). In Abramo (2012), we derived the multi-tracer Fisher matrix directly from the covariance of the counts of the tracers (the “pixel covariance”). The basic difference between the approaches of White et al. (2008); McDonald & Seljak (2008) and ours is that those authors regard the cross-power spectra as independent parameters, while we implicitly assume that, for the purposes of estimating the power spectra from the data, the cross-spectra are determined by the auto-spectra (see, however, Swanson et al. (2008) and Bonoli & Pen (2008) for situations where this may not be true). The Fisher matrix computed in Eq. (21) of Hamaus et al. (2012) also reduces to ours, if the cross-correlations are unaffected by shot noise (see also Smith (2009); Smith & Marian (2014, 2015)).


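We now show how to obtain the multi-tracer Fisher matrix from first principles. The generalization of Eq. (2) in the present context is:

$$F_{\mu,i;\nu,j} = \frac{1}{2}\sum_{\alpha\beta\gamma\sigma}\int d^3x\,d^3x'\,d^3x''\,d^3x'''\;C^{-1}_{\alpha\beta}(\vec{x},\vec{x}')\,\frac{\partial C_{\beta\gamma}(\vec{x}',\vec{x}'')}{\partial P_{\mu,i}}\,C^{-1}_{\gamma\sigma}(\vec{x}'',\vec{x}''')\,\frac{\partial C_{\sigma\alpha}(\vec{x}''',\vec{x})}{\partial P_{\nu,j}} . \tag{18}$$

Let us express the covariance of tracer counts as:

$$\begin{aligned} C_{\alpha\beta}(\vec{x},\vec{x}') &= \xi_{\alpha\beta}(\vec{x},\vec{x}') + \frac{\delta_{\alpha\beta}}{\bar{n}_\alpha(\vec{x})}\,\delta_D(\vec{x}-\vec{x}') \\ &= \int\frac{d^3k}{(2\pi)^3}\,e^{i\vec{k}\cdot(\vec{x}-\vec{x}')}\left[B_\alpha(\vec{x},\vec{k})\,P_m(\bar{x},k)\,B_\beta(\vec{x}',\vec{k}) + \frac{\delta_{\alpha\beta}}{\bar{n}_\alpha(\vec{x})}\right] , \end{aligned} \tag{19}$$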


where $\bar{x}$ denotes the mean position (or redshift) at which the matter power spectrum is evaluated. A key difficulty with the covariance of counts is that, in any realistic situation, it cannot be inverted. However, if the effective biases and the power spectrum depend weakly on $k$, then it is a fair approximation to integrate the complex exponential in Eq. (19) into a Dirac delta-function, and to pull the rest of the integrand outside of the integral (Hamilton 2005a,b). This amounts to the approximation:

$$C_{\alpha\beta}(\vec{x},\vec{x}') \approx \delta_D(\vec{x}-\vec{x}')\times\left[\frac{\delta_{\alpha\beta}}{\bar{n}_\alpha} + B_\alpha P_m B_\beta\right] . \tag{20}$$

This expression can now be easily inverted, as we will show next.

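The inverse of the covariance should obey the property:

$$\sum_\beta\int d^3x'\,C^{-1}_{\alpha\beta}(\vec{x},\vec{x}')\,C_{\beta\gamma}(\vec{x}',\vec{x}'') = \sum_\beta\int d^3x'\,C_{\alpha\beta}(\vec{x},\vec{x}')\,C^{-1}_{\beta\gamma}(\vec{x}',\vec{x}'') = \delta_{\alpha\gamma}\,\delta_D(\vec{x}-\vec{x}'') . \tag{21}$$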


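Since $\int d^3x'\,\delta_D(\vec{x}-\vec{x}')\,\delta_D(\vec{x}'-\vec{x}'') = \delta_D(\vec{x}-\vec{x}'')$, all we have to do is to invert the matrix inside the square brackets in Eq. (20). But matrices of the type $M_{\alpha\beta} = \delta_{\alpha\beta} + v_\alpha v_\beta$ can be easily inverted; in fact $M^{-1}_{\alpha\beta} = \delta_{\alpha\beta} - v_\alpha v_\beta/(1+v^2)$, where $v^2 = \sum_\alpha v_\alpha^2$. A simple generalization of this simple case leads immediately to:

$$C^{-1}_{\alpha\beta}(\vec{x},\vec{x}') \approx \delta_D(\vec{x}-\vec{x}')\times\left[\delta_{\alpha\beta}\,\bar{n}_\alpha - \frac{\bar{n}_\alpha B_\alpha\,P_m\,B_\beta\bar{n}_\beta}{1+\mathcal{P}}\right] , \tag{22}$$

where we define the total clustering strength as the sum of all clustering strengths:

$$\mathcal{P}(\vec{x},\vec{k}) = \sum_\alpha\bar{n}_\alpha(\vec{x})\,B_\alpha^2(\vec{x},\vec{k})\,P_m(\vec{x},k) = \sum_\alpha\mathcal{P}_\alpha(\vec{x},\vec{k}) . \tag{23}$$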
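The inversion leading to Eq. (22) can be checked numerically for a single mode. The sketch below builds the bracket of Eq. (20) and the bracket of Eq. (22) for a few tracers with constant densities and effective biases, and verifies that their product is the identity. The function names and the numerical values are illustrative assumptions.

```python
import numpy as np

def pixel_covariance(nbar, B, Pm):
    """Bracket of Eq. (20): delta_ab / nbar_a + B_a Pm B_b."""
    return np.diag(1.0 / nbar) + Pm * np.outer(B, B)

def pixel_covariance_inverse(nbar, B, Pm):
    """Bracket of Eq. (22): delta_ab nbar_a - nbar_a B_a Pm B_b nbar_b / (1 + P)."""
    P_tot = np.sum(nbar * B**2 * Pm)   # total clustering strength, Eq. (23)
    nB = nbar * B
    return np.diag(nbar) - Pm * np.outer(nB, nB) / (1.0 + P_tot)

if __name__ == "__main__":
    nbar = np.array([1.0e-3, 5.0e-4, 2.0e-4])   # illustrative number densities
    B = np.array([1.0, 1.5, 2.2])               # illustrative effective biases
    Pm = 1.0e4                                  # illustrative matter power
    product = pixel_covariance(nbar, B, Pm) @ pixel_covariance_inverse(nbar, B, Pm)
    print(np.round(product, 12))                # identity matrix
```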


The problem with Eqs. (20) and (22) is that they refer to a scale $k$ which does not exist in the original expression, Eq. (19). In fact, Eqs. (20)-(22) treat the positions of the two-point function, $\vec{x}$ and $\vec{x}'$, as one and the same, due to the Dirac delta-function. Hence, one should think of the Fourier mode $k$ which is implicit in Eqs. (20)-(22) as the reciprocal of some typical physical distance between $\vec{x}$ and $\vec{x}'$, and in that sense, its role is to limit the scope of that distance in expressions involving these approximations. Notice that this issue appears already in the FKP and PVP methods, and we do not present any new development regarding this point.

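Coming back to Eq. (18), we see that the last step before constructing the Fisher matrix is the computation of the term $\partial C_{\alpha\beta}(\vec{x},\vec{x}')/\partial P_{\mu,i}$. Once again, it is useful to employ the notion of functional derivatives and the results of the previous Section. Using the second line of Eq. (19) and the fact that $\partial P_\alpha(\vec{x},\vec{k})/\partial P_{\mu,i} = \delta_{\alpha\mu}\,\delta^i_{\vec{x},\vec{k}}$, we find, after some rearrangement, that:

$$\frac{\partial C_{\alpha\beta}(\vec{x},\vec{x}')}{\partial P_{\mu,i}} = \int\frac{d^3k}{(2\pi)^3}\,e^{i\vec{k}\cdot(\vec{x}-\vec{x}')}\,\frac{B_\alpha(\vec{x},\vec{k})\,B_\beta(\vec{x}',\vec{k})}{2B_\mu^2(\vec{x}_i,\vec{k}_i)}\left[\delta_{\alpha\mu}\,\delta^i_{\vec{x},\vec{k}} + \delta_{\beta\mu}\,\delta^i_{\vec{x}',\vec{k}}\right] . \tag{24}$$

Notice that this object should be regarded as an operator: when it acts on functions of $\vec{k}$ it causes an integration over Fourier space, which is restricted to the volume of the bin $\vec{k}_i$ by the presence of the $\delta^i_{\vec{x},\vec{k}}$ and $\delta^i_{\vec{x}',\vec{k}}$. Apart from a volume factor $\tilde{V}_{\vec{k}_i}$, this integration is nothing but an average over the Fourier bin $\vec{k}_i$.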

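We can now obtain the Fisher matrix by substituting Eqs. (22) and (24) into Eq. (18). The result, after a bit of algebra, is that:

$$F_{\mu,i;\nu,j} = \frac{\delta_{ij}}{P_{\mu,i}\,P_{\nu,i}}\int_{V_i}\frac{d^3x\,d^3k}{(2\pi)^3}\,\mathcal{F}_{\mu\nu}(\vec{x},\vec{k}) , \tag{25}$$

where:

$$\mathcal{F}_{\mu\nu} = \frac{1}{4}\,\frac{\delta_{\mu\nu}\,\mathcal{P}_\mu\,\mathcal{P}\,(1+\mathcal{P}) + \mathcal{P}_\mu\,\mathcal{P}_\nu\,(1-\mathcal{P})}{(1+\mathcal{P})^2} . \tag{26}$$

Eq. (26) is in fact the Fisher information density per unit of phase space volume for $\log\mathcal{P}_\mu$ (Abramo 2012):

$$F[\log\mathcal{P}_\mu(\vec{x},\vec{k}),\log\mathcal{P}_\nu(\vec{x}',\vec{k}')] = (2\pi)^3\,\delta_D(\vec{x}-\vec{x}')\,\delta_D(\vec{k}-\vec{k}')\,\mathcal{F}_{\mu\nu}(\vec{x},\vec{k}) , \tag{27}$$

or, equivalently, in bins of finite volume:

$$F[\log\mathcal{P}_{\mu,i},\log\mathcal{P}_{\nu,j}] = \delta_{ij}\,\mathcal{F}_{\mu\nu}(\vec{x}_i,\vec{k}_i) . \tag{28}$$

One can easily check that the multi-tracer Fisher matrix of Eq. (25) reduces to the FKP Fisher matrix, Eq. (16), when there is only one type of tracer.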
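The single-tracer limit, and the algebra that leads from the pixel covariance to the information density, can be verified numerically for one mode. The sketch below assumes the closed form F_mu_nu = (1/4) [delta_mu_nu P_mu P (1+P) + P_mu P_nu (1-P)] / (1+P)^2 for the multi-tracer Fisher information density and compares it with a brute-force evaluation of (1/2) tr[C^-1 dC/dlogP_mu C^-1 dC/dlogP_nu], using the brackets of Eqs. (20) and (24). Function names and parameter values are hypothetical.

```python
import numpy as np

def fisher_density_multi(nbar, B, Pm):
    """Closed-form multi-tracer Fisher information density for log P_mu."""
    Pa = nbar * B**2 * Pm              # clustering strengths of each tracer
    Pt = Pa.sum()                      # total clustering strength
    num = np.diag(Pa) * Pt * (1.0 + Pt) + np.outer(Pa, Pa) * (1.0 - Pt)
    return 0.25 * num / (1.0 + Pt) ** 2

def fisher_density_brute_force(nbar, B, Pm):
    """(1/2) tr[C^-1 dC_mu C^-1 dC_nu] for a single mode, from the pixel covariance."""
    n = len(B)
    Cinv = np.linalg.inv(np.diag(1.0 / nbar) + Pm * np.outer(B, B))
    dC = []
    for mu in range(n):
        e = np.zeros(n)
        e[mu] = 1.0
        # dC_ab / dlog P_mu = (Pm / 2) B_a B_b (delta_a,mu + delta_b,mu)
        dC.append(0.5 * Pm * B[mu] * (np.outer(e, B) + np.outer(B, e)))
    return np.array([[0.5 * np.trace(Cinv @ a @ Cinv @ b) for b in dC] for a in dC])

if __name__ == "__main__":
    nbar = np.array([1.0e-3, 5.0e-4, 2.0e-4])
    B = np.array([1.0, 1.5, 2.2])
    Pm = 1.0e4
    print(fisher_density_multi(nbar, B, Pm))
    print(fisher_density_brute_force(nbar, B, Pm))   # same matrix
```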
L. R. Abramo & Lucas F. Secco & Arthur Loureiro


3 THE OPTIMAL MULTI-TRACER QUADRATIC ESTIMATORS

Starting from the Fisher matrix of Eq. (25), and with the help of Eqs. (22) and (24), we are in a position to implement the construction of the estimators which was presented in Section 2.1. For now we will not discuss the role of random maps, which help to subtract spurious fluctuations that could be generated by modulations on the mean number of tracers, $\bar{n}_\mu$. The calculations below are exactly the same with or without the random maps, so we come back to this issue at the end of this Section, after we have shown how to construct the multi-tracer estimators.

Since our data are the density contrasts of the tracers, the quadratic form of Eq. (3) becomes:

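$$Q_{\mu,i} = \sum_{\alpha\beta}\int d^3x\,d^3x'\;E^{\alpha\beta}_\mu(\vec{x},\vec{x}';\vec{x}_i,\vec{k}_i)\,\delta_\alpha(\vec{x})\,\delta_\beta(\vec{x}') - \delta Q_{\mu,i} , \tag{29}$$

where $\delta Q_{\mu,i}$ ensures that the estimators are unbiased, and, according to the appropriate generalization of Eq. (4):

$$E^{\alpha\beta}_\mu(\vec{x},\vec{x}';\vec{x}_i,\vec{k}_i) = \frac{1}{2}\sum_{\sigma\gamma}\int d^3y\,d^3y'\;C^{-1}_{\alpha\sigma}(\vec{x},\vec{y})\,\frac{\partial C_{\sigma\gamma}(\vec{y},\vec{y}')}{\partial P_{\mu,i}}\,C^{-1}_{\gamma\beta}(\vec{y}',\vec{x}') . \tag{30}$$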

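Inserting Eqs. (22) and (24) into Eq. (30), and then back into Eq. (29), leads to the following expression for the quadratic form:

$$Q_{\mu,i} = \frac{1}{4B_\mu^2}\sum_{\alpha\beta}\int d^3x\,d^3x'\int\frac{d^3k}{(2\pi)^3}\,e^{i\vec{k}\cdot(\vec{x}-\vec{x}')}\;f_\alpha(\vec{x},\vec{k})\left(\delta_{\alpha\mu}\,\delta^i_{\vec{x},\vec{k}} + \delta_{\beta\mu}\,\delta^i_{\vec{x}',\vec{k}}\right)f_\beta(\vec{x}',\vec{k}) - \delta Q_{\mu,i} , \tag{31}$$

where

$$f_\alpha(\vec{x},\vec{k}) = \sum_\sigma w_{\sigma\alpha}(\vec{x},\vec{k})\,\delta_\sigma(\vec{x}) \tag{32}$$

are “weighted density contrasts” for the tracers. The weights are:

$$w_{\sigma\alpha}(\vec{x},\vec{k}) = \left[\delta_{\sigma\alpha} - \frac{\mathcal{P}_\alpha(\vec{x},\vec{k})}{1+\mathcal{P}(\vec{x},\vec{k})}\right]\bar{n}_\sigma B_\sigma(\vec{x},\vec{k}) . \tag{33}$$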


These weights are the generalization of the FKP weights (Feldman et al. 1994) for the case of multiple tracers of large-scale structure. As we will prove in a moment, Eq. (33) defines the optimal weights for maps containing an arbitrary number of different types of tracers. In the case of a single species of tracer, the weights for the density contrast reduce to $w = \bar{n}B/(1+\bar{n}B^2P_m)$, which is precisely the FKP weight for the density contrast, except for a normalization, whose origin and purpose will become clearer soon.
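A minimal sketch of the weighting step is given below, assuming constant number densities and effective biases for each tracer, and taking the weights in the form w[s, a] = (delta_sa - P_a / (1 + P)) nbar_s B_s. With this form, the weighted field of species a equals B_a times the a-th component of the inverse pixel covariance applied to the density contrasts, which is the sense in which the data are weighted by the inverse of their covariance. Function names and numbers are illustrative assumptions.

```python
import numpy as np

def multi_tracer_weights(nbar, B, Pm):
    """Weights w[s, a] = (delta_sa - P_a / (1 + P)) * nbar_s * B_s (cf. Eq. 33)."""
    Pa = nbar * B**2 * Pm
    Pt = Pa.sum()
    return (np.eye(len(B)) - Pa[None, :] / (1.0 + Pt)) * (nbar * B)[:, None]

def weighted_fields(delta, nbar, B, Pm):
    """f_a = sum_s w[s, a] delta_s (cf. Eq. 32); delta has shape (n_tracers, ...)."""
    w = multi_tracer_weights(nbar, B, Pm)
    return np.tensordot(w.T, delta, axes=1)

if __name__ == "__main__":
    nbar = np.array([1.0e-3, 5.0e-4, 2.0e-4])
    B = np.array([1.0, 1.5, 2.2])
    Pm = 1.0e4
    rng = np.random.default_rng(1)
    delta = rng.normal(size=(3, 8))          # toy density contrasts in 8 cells
    f = weighted_fields(delta, nbar, B, Pm)
    print(f.shape)                           # one weighted field per tracer
    # Single-tracer limit: the FKP-like weight nbar B / (1 + nbar B^2 Pm)
    print(multi_tracer_weights(nbar[:1], B[:1], Pm)[0, 0],
          nbar[0] * B[0] / (1.0 + nbar[0] * B[0] ** 2 * Pm))
```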


Returning to Eq. (31), notice that the Kronecker delta functions are accompanied by their respective restrictions over the phase space volume of the bin where we are estimating the quantities of interest (in our case, the $P_{\mu,i}$), hence:

$$Q_{\mu,i} = \frac{1}{4B_\mu^2}\int_{\tilde{V}_{\vec{k}_i}}\frac{d^3k}{(2\pi)^3}\left[\int_{V_{\vec{x}_i}} d^3x\,e^{i\vec{k}\cdot\vec{x}}\,f_\mu(\vec{x},\vec{k})\int d^3x'\,e^{-i\vec{k}\cdot\vec{x}'}\,f(\vec{x}',\vec{k}) + \mathrm{c.c.}\right] - \delta Q_{\mu,i} , \tag{34}$$

where:

$$f(\vec{x},\vec{k}) = \sum_\alpha f_\alpha(\vec{x},\vec{k}) . \tag{35}$$

Hence, the spatial integral over $f_\mu$ covers only the bin volume $V_{\vec{x}_i}$, while the spatial integral over $f$ should be performed over the whole volume of the survey. Although this is a subtlety which is present already in the Fourier analysis à la FKP, in practice we are always considering data on finite volume bins, and all integrations are performed inside each one of those bins. Nevertheless, a more rigorous treatment would dictate that one of the Fourier integrations be performed over the whole available volume of the survey, while the other would be carried out over the volume of the particular bin under consideration.

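In what follows we will ignore this subtlety, and will consider that both integrations over spatial volume result in the Fourier transforms $\tilde{f}_\mu$ and $\tilde{f}$. We then obtain that:

$$Q_{\mu,i} = \frac{1}{4B_\mu^2}\int_{\tilde{V}_{\vec{k}_i}}\frac{d^3k}{(2\pi)^3}\left[\tilde{f}_\mu(\vec{k})\,\tilde{f}^*(\vec{k}) + \mathrm{c.c.}\right] - \delta Q_{\mu,i} . \tag{36}$$

But the integration above is, up to the volume factor, simply the average over the Fourier bin, hence we have:

$$Q_{\mu,i} = \frac{\tilde{V}_{\vec{k}_i}}{4B_\mu^2}\left\langle\tilde{f}_\mu\,\tilde{f}^* + \mathrm{c.c.}\right\rangle_{\vec{k}_i} - \delta Q_{\mu,i} . \tag{37}$$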


Notice that, in this expression and others like it, the factor of $B_\mu^2$ (which here plays the role of a normalization) is the fiducial value of the effective bias, whereas the weighted density contrasts $\tilde{f}_\mu$ must be computed directly from the data. However, since the weights of Eq. (33) are themselves also computed using the fiducial values of $B_\mu$ and $P_m$, the weighted fields $\tilde{f}_\mu$ are a combination of both theory and data. The situation is not different from the usual case of Fourier analysis of cosmological surveys employing the FKP or the PVP estimators. Evidently, these quadratic estimators are only truly optimal if the parameters take their fiducial values.
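The bin average that enters Eq. (37) can be sketched with a fast Fourier transform, assuming that the weighted fields have already been interpolated onto a periodic cubic grid and that the Fourier bins are full spherical shells. The sketch returns only the shell average of the product of one weighted field with the conjugate of the total field, plus its complex conjugate; the prefactors of Eq. (37) and the bias term are deliberately left out, and the function name and grid conventions are assumptions.

```python
import numpy as np

def shell_averaged_cross(f_mu, f_tot, box_size, k_edges):
    """Average of f_mu(k) f_tot^*(k) + c.c. over spherical shells in k.

    f_mu, f_tot : real 3D arrays on a cubic grid (weighted fields)
    box_size    : side of the cube
    k_edges     : increasing array of shell edges
    Returns one average per shell (NaN for shells that contain no modes).
    """
    n = f_mu.shape[0]
    cell = (box_size / n) ** 3
    # Discrete approximation to the continuous Fourier transform; the sign
    # convention of the exponent does not affect the symmetrized product.
    fk_mu = np.fft.fftn(f_mu) * cell
    fk_tot = np.fft.fftn(f_tot) * cell
    k1d = 2.0 * np.pi * np.fft.fftfreq(n, d=box_size / n)
    kx, ky, kz = np.meshgrid(k1d, k1d, k1d, indexing="ij")
    kmag = np.sqrt(kx**2 + ky**2 + kz**2).ravel()
    cross = (2.0 * np.real(fk_mu * np.conj(fk_tot))).ravel()
    idx = np.digitize(kmag, k_edges) - 1
    out = np.full(len(k_edges) - 1, np.nan)
    for i in range(len(k_edges) - 1):
        sel = idx == i
        if sel.any():
            out[i] = cross[sel].mean()
    return out

if __name__ == "__main__":
    L, n = 100.0, 16
    x = np.arange(n) * L / n
    wave = np.cos(2.0 * np.pi * 2.0 * x / L)[:, None, None] * np.ones((n, n, n))
    print(shell_averaged_cross(wave, wave, L, np.array([0.0, 0.1, 0.15])))
```

In a real analysis the shells would be replaced by whatever bins in wavenumber and angle are being estimated, and the same routine would be called once per tracer species with the total weighted field as second argument.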

Starting either from Eq. (37), or more directly from Eq. (29), a long but straightforward calculation shows that the covariance of this quadratic form in fact results in $\mathrm{Cov}(Q_{\mu,i},Q_{\nu,j}) = F_{\mu,i;\nu,j}$, where the Fisher matrix was given in Eq. (25).


Finally we can construct the optimal quadratic estima¬ 
tors for the power spectra of any tracer species, by plugging 
the quadratic form above into the appropriate generalization 
of Eq. §. The Fisher matrix that is relevant in this case 
was already given in Eq. (25). We have, therefore, that the
optimal quadratic estimators, whose covariances are given
by the inverse of the Fisher matrix, are given by:


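$$
\begin{aligned}
\hat P_{\mu,i} &= \sum_{\nu,j} F^{-1}_{\mu i,\nu j} \, Q_{\nu,j} \\
&= \sum_{\nu} F^{-1}_{\mu\nu,i} \, Q_{\nu,i} ,
\end{aligned}
\tag{38}
$$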


where the second line follows from the fact that the Fisher 
matrix is diagonal in the phase space bin^. 

Now the origin of the normalizations of the weights,
which appear both in the FKP and the PVP formulas, becomes
clear: up to the prefactor in Eq. (37), those normalizations
correspond to the inverse of the Fisher matrix in Eq. (38).

^ This is only true if the Fourier-space bins are sufficiently large,
such that the spacing between them is larger than the reciprocal


3.1 Subtracting the bias of the estimators 

Although by definition $\mathrm{Cov}(\hat P_{\mu,i}, \hat P_{\nu,j}) = F^{-1}_{\mu i,\nu j}$, we
still must ensure that the estimators are unbiased. According
to Eq. 1^, those biases are: 

dCci3{x,x') 

2^.1 “ ® dP, 


SQ 


lEf 

-EE^. 


(39) 


P 


This expression can be easily worked out, and the result is: 

(fix fik 

fVi 


^Q^,i = 


BL L 


2BlJy^ (2^)3 1 + P 


■ (40) 


It is also useful to compute the bias corrections for the
power spectrum estimators, $\delta P_{\mu,i}$. For this calculation we will
employ the approximation that averages over the bins can
be manipulated in such a way that $\langle AB \rangle_i \simeq \langle A \rangle_i \langle B \rangle_i$. This
amounts to assuming that the bins are small compared with
the coherence scale of the quantities of interest.

In order to go from $Q_{\mu,i}$ to $\hat P_{\mu,i}$ we must first find the
inverse of the Fisher matrix which was found in Eq. (25).
But that is basically the inverse of the Fisher matrix for the
$\log \mathcal{P}_\mu$ which was found in Eq. (26). This is a particular case
of the same type of matrix inversion which we used in the
case of the pixel covariance, and the result is that:


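$$ F^{-1}_{\mu\nu} = P_\mu P_\nu \, \mathcal{F}^{-1}_{\mu\nu} , \tag{41} $$

where:

$$ \mathcal{F}^{-1}_{\mu\nu} = \frac{4(1+\mathcal{P})}{\mathcal{P}} \left[ \frac{\delta_{\mu\nu}}{\mathcal{P}_\mu} + \frac{\mathcal{P}-1}{2\mathcal{P}} \right] . \tag{42} $$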


Using this expression, and the approximation that bin averages
can be freely rearranged, we obtain that the estimators
of the power spectra of the tracers reduce to:




-<5Pu 


(43) 


In fact, one can also show directly from Eq. (40) that the
biases of the estimators are:


5P^,, = 


2 ; u,z] 


V 


(44) 


which implies that $\langle \hat P_{\mu,i} \rangle \to P_{\mu,i}$, as it should. In the case of
a single type of tracer, the bias of the estimator reduces to
the (Poissonian) shot noise, $1/\bar n$.

3.2 The window functions 

The expectation values of the power spectra obtained
through the multi-tracer quadratic estimators are convolutions
of the true power spectra with some window functions.


of the typical size of the position-space bin, $\Delta k_i > \pi / V_i^{1/3}$; see,
e.g., Abramo (2012).


These window functions can be obtained directly from the
expectation value of the expression in Eq. (31), by taking
$\delta_\alpha \to B_\alpha \delta_m$ and neglecting the biases of the estimators:

f (45) 

X'Wfj,a{x, k)Ba{x, k) k)B^{x\ k) 

m {x'))+c.c. . 




^ Y f 


From the definition of the weight functions, Eq. (33), it
is easy to derive that ~ + P) = 

Pa/Pm(l + P), and expressing the matter 2-point corre¬ 
lation function in terms of the matter power spectrum, we 


obtain: 


= 

1 f FxXk f 

^BlJy^ (2^)3 J 

xe‘'^-*')-®G^(T,fe)e 


where: 



Pm{k') (46) 


Px' (Pk' 

(27r)3 

-'(*^-^')'"'G(T',fc)-bc.c. , 


$$ G_\mu(\vec x, \vec k) = \frac{\mathcal{P}_\mu}{P_m \, (1+\mathcal{P})} , \tag{47} $$

and $G = \sum_\mu G_\mu = \mathcal{P}/[P_m (1+\mathcal{P})]$.


Once again, one of the real-space integrals in Eq. (46)
ought to be carried out only over the volume of the spatial
bin, $V_i$, while the other should be in principle carried out
over the whole volume of the survey (e.g., all redshift slices).
In practice, it may be more conservative to treat each bin in
position space as an entirely independent survey, and in that
case the two integrals over the real volume would be carried
out only on the volume of the bin $i$. In fact, it is only in this
limit that the Fisher matrix of Eq. ( [^ , or that of Eq. ( |^ , 
are truly diagonal in the bins $i$ and $j$ (Abramo 2012), and
therefore it is only in this sense that the optimal estimators
satisfy the constraint that $\mathrm{Cov}(\hat P_{\mu,i}, \hat P_{\nu,j}) = F^{-1}_{\mu i,\nu j}$.

Hence, we define the window function:


[F^ 




^G^{k,k')G*{k,k') + c.c. , 

(48) 


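where the Fourier transforms of the kernels of Eq. (47) are:

$$ \tilde G_\mu(\vec k, \vec k') = \int_{V_i} d^3x \, e^{i(\vec k - \vec k')\cdot\vec x} \, G_\mu(\vec x, \vec k) , \tag{49} $$

$$ \tilde G(\vec k, \vec k') = \sum_\mu \tilde G_\mu(\vec k, \vec k') . \tag{50} $$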


Because the integral over $d^3k$ is performed only over
the Fourier bin $V_{k_i}$, it is often an accurate approximation to
take $\vec k \to \vec k_i$ in the argument of the kernels of Eqs. (49)-(50),
and replace:


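$$ \tilde G_\mu(\vec k, \vec k') \to \tilde G_{\mu,i}(\vec k - \vec k') = \int_{V_i} d^3x \, e^{i(\vec k - \vec k')\cdot\vec x} \, G_\mu(\vec x, \vec k_i) , \tag{51} $$

$$ \tilde G(\vec k, \vec k') \to \tilde G_i(\vec k - \vec k') = \sum_\mu \tilde G_{\mu,i}(\vec k - \vec k') . \tag{52} $$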


Notice that the Fourier transform of the kernels with respect
to their spatial dependence still remains, since we do not
replace $\vec k$ by $\vec k_i$ in the exponentials.

With these definitions, we can write the effective window
function of the quadratic form as:


W, 


(Q) 


4 

Ve 




X, Xy 

{G^,i{k-k')G:{k-k')^^ +C.C. . 


Hence, in terms of the window function we have: 

{Q^A-1-0^Prnik')W^9\R,k'). (54) 

An interesting limiting case happens when we take all quantities
to be constant inside the spatial bin, $\mathcal{P}_\mu(\vec x, \vec k) \to
\mathcal{P}_{\mu,i}(\vec k)$, and then take the continuum limit. In that case
the kernels in Fourier space become Dirac-delta functions,
$\tilde G_\mu \to G_{\mu,i} \, (2\pi)^3 \delta_D(\vec k - \vec k')$, and the window function
becomes:

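density contrasts in a way similar to the definition of Eq.
(32):

$$ f_\mu(\vec x, \vec k) = \sum_\nu w_{\mu\nu}(\vec x, \vec k) \left[ \delta_\nu(\vec x) - \frac{A_\nu}{\alpha_\nu} \delta^r_\nu(\vec x) + 1 - \frac{A_\nu}{\alpha_\nu} \right] , \tag{59} $$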

where the weights were given in Eq. (33). The values of
$A_\nu$ should be calibrated in such a way that the weighted
fields $f_\mu$ have zero mean over the volume of the sample, thus
ensuring the so-called integral constraints, $\langle \hat P_\mu(k = 0) \rangle \to 0$.
It is easy to check that the condition $\int d^3x \, f_\mu = 0$ is satisfied
by setting:

$$ A_\mu = \sum_\nu R^{-1}_{\mu\nu} \, D_\nu , \tag{60} $$

w, 


(Q) 


'Pa.i'Pi 




X (27r)^5_D(fci — k') . (55) 


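The most relevant window functions are, of course, not
$W^{(Q)}_{\mu,i}$, but those which apply for the estimators of the power
spectra. Since $\langle \hat P_{\mu,i} \rangle = \sum_\nu F^{-1}_{\mu\nu,i} \langle Q_{\nu,i} \rangle$, we obtain that: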


(h,.)./ 


Pk' 

(^ 


P^ik')Wy,i 


(56) 


where: 


(57) 

Finally, in the same limit that was used to obtain Eq. (55),
we can apply the identity:


7?“i V A _ \ _ — 2B^ P 

Bl Prr, I+ V ^ " 


l + P 
V 


(58) 


to show that Wyy —^ x {2'KA&D{ki — k'), as in fact it 

ought to be. This completes the demonstration that the estimators
derived in this Section satisfy all the desired criteria
for optimal, unbiased estimators, with the correct continuum
limits.


3.3 Random maps and the integral constraints 

Up to now we have introduced the optimal multi-tracer
quadratic estimators without mentioning the role of the random
("synthetic") maps. They help subtract the fluctuations
that arise purely as a result of modulations in the
mean number density of the tracers, $\bar n_\mu(\vec x)$, and are caused
by, e.g., angular- or redshift-dependent variations in the
selection function of a survey (Feldman et al. 1994).

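For each tracer species with mean number density $\bar n_\mu(\vec x)$
we define a random (white noise) map with a mean number
density with the same shape as that which is presumed for
the data: $\bar n^r_\mu(\vec x) = \bar n_\mu(\vec x)/\alpha_\mu$, where $\alpha_\mu$ are (small) constants.
The random datasets have no structure, in the sense that
their pixel covariances are just given by the shot noises of
each sample:

$$ \langle \delta^r_\mu(\vec x) \, \delta^r_\nu(\vec x') \rangle = \frac{\delta_{\mu\nu}}{\bar n^r_\mu} \, \delta_D(\vec x - \vec x') = \alpha_\mu \frac{\delta_{\mu\nu}}{\bar n_\mu} \, \delta_D(\vec x - \vec x') . $$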

With the data and random sets we construct weighted 


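where:

$$ D_\nu = \int d^3x \sum_\alpha w_{\nu\alpha}(\vec x, \vec k) \, [1 + \delta_\alpha(\vec x)] , \tag{61} $$

$$ R_{\nu\mu} = \frac{1}{\alpha_\mu} \int d^3x \, w_{\nu\mu}(\vec x, \vec k) \, [1 + \delta^r_\mu(\vec x)] . \tag{62} $$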

Since $D_\nu$ and $R_{\nu\mu}$ are functions of $k$, in principle the constants
$A_\mu$ also depend on the wavenumber. In practice, we
employ only a couple of putative values for $P_m$ in all the
weights, hence we compute $A_\mu$ only for those values.

Usually the mean density contrasts of the random catalogs
are very close to zero, which means that $A_\mu \to \alpha_\mu$ to
a very good approximation. Indeed, taking $\delta^r_\mu \to 0$ in Eq.
(62) it follows that Eq. (60) can be recast as:


Ay 


1 + E 


(Lx Wy 


,ix) 


j (Lx Wyy{x)5y{x) . 


(63)

The fractional difference between $A_\mu$ and $\alpha_\mu$ is of the order
of the average of the density contrast over the whole volume
of the catalog. This correction is negligible unless the
galaxy catalogs are extremely sparse, hence it is often safe
to take $A_\mu \to \alpha_\mu$. One can also improve this approximation
by taking smaller values of $\alpha_\mu$, which makes Eq. (63)
more accurate. However, if there are reasons (e.g., computational)
to limit the size of the synthetic catalogs, such that
$\alpha_\mu$ cannot be too small, then $A_\mu$ may deviate from $\alpha_\mu$.


Using Eq. (59) instead of Eq. (32) in the estimators does
not make much difference in our calculations, except for the
biases of the estimators, which inherit the factors of $A_\mu$ and
$\alpha_\mu$. Starting from Eq. (40) we obtain:


5Qy,i = 


LxLk nyBy 


‘2BlJy^ (2^)3 (1+P)2 


(64) 


1 -f ^ PP) ~ Pu] r + ^Qy,i , 


where the extra term, $\Delta Q_{\mu,i}$, arises when $A_\mu \neq \alpha_\mu$, leading
to the additional correction:


= 


f AxY 
^BlJ Y 


A, 


- 1 w. 



(65) 


This expression can be simplified with the help of the def¬ 
initions and F = leading 

to: 


^Qn,i = 


1 

2^ 


[ cF X 

[f rp,] 

J 

L i + pj 


1-f-P 


( 66 ) 


As we discussed above, in most cases the tracers are 
sufficiently abundant to make ~ so F^ —>■ 0, and the 


extra term of Eq. (661 can be neglected. The biases of the 


estimators are then given only by the first term of Eq. (641 
— with the simplification that —>■ If, in addition, 

we assume that the random maps are constructed such that 


the are all identical, 
estimators become simply: 

1 -I- q: 


a, then the biases of the 


<5(3^4 = 




L 


drxdrk 

(27r)3 {1 + VF 


(67) 


Here, instead, we built the optimal estimators directly
on the basis of the pixel covariance, assuming Gaussianity of
the data. We already showed that our estimators reduce to
that of FKP in the case of a single species of tracer. Now we
show that the PVP estimator is just one of many possible
projections of the multi-tracer estimators.

If we fix the effective biases to their fiducial values
(i.e., if the bias of each type of galaxy and the shape of
the RSDs are set to their true values), then the remaining
unknown is the matter power spectrum at the position- and
Fourier-space bins, $P_{m,i}$. We may now ask what is the Fisher
matrix for the matter power spectrum. This is easily derived
from Eq. (25) through the change of variable:

_ dPij,,k p 


F{Pm,i, Pmj) 


u, dPm,i 

flu kl 


dP„^ 


(69) 


4 PROPERTIES AND RELATIONS OF THE 
MULTI-TRACER ESTIMATOR 

The results of the previous Section are closely related to
other methods for the Fourier analysis of cosmological surveys,
but they also extend their scope considerably.

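The simplest limit is when we take all tracers to be a
single species. In that case our formulas reduce to the ones
by FKP. The weights of Eq. (33) reduce to $w = \bar n B/(1+\mathcal{P})$,
which are the FKP weights after we make the identification
$\mathcal{P} = \sum_\mu \bar n_\mu B_\mu^2 P_m = \bar n B^2 P_m$, where $\bar n = \sum_\mu \bar n_\mu$ and
$B^2 = \bar n^{-1} \sum_\mu \bar n_\mu B_\mu^2$. Furthermore, the multi-tracer Fisher
matrix of Eq. (26) also reduces to the FKP Fisher matrix
once we sum over all the clustering strengths, i.e., when we
combine all tracers into a single type. Changing variables
in the Fisher matrix $\mathcal{F}_{\mu\nu} = F[\log\mathcal{P}_\mu, \log\mathcal{P}_\nu]$, from $\log\mathcal{P}_\mu$
to $\log\mathcal{P}$, introduces a constant Jacobian, $J_\mu = \partial\log\mathcal{P}_\mu/\partial\log\mathcal{P} = 1$.
This can be seen by considering the inverse Jacobian, $J^{-1}_\mu =
\partial\log\mathcal{P}/\partial\log\mathcal{P}_\mu = \mathcal{P}_\mu/\mathcal{P}$, which satisfies
$\sum_\mu J_\mu J^{-1}_\mu = \sum_\mu \mathcal{P}_\mu/\mathcal{P} = 1$. Hence the multi-tracer Fisher
matrix projected into the Fisher matrix for the total clustering
strength becomes: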

matrix projected into the Fisher matrix for the total clus¬ 
tering strength becomes: 


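$$ F[\log\mathcal{P}] = \sum_{\mu\nu} J_\mu \, \mathcal{F}_{\mu\nu} \, J_\nu = \frac{1}{2} \left( \frac{\mathcal{P}}{1+\mathcal{P}} \right)^2 , \tag{68} $$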

which is the FKP Fisher information density per unit of 
phase space volume. 

We now discuss some of the main features of the multi-tracer
technique, as well as its relations to other methods in
the literature.


p2 


2 / 


6ij f (Fx(Fk ( F Y 

PfF J ( 2^)3 2 [l + Vj 


where we used that $\partial P_{\mu,k}/\partial P_{m,i} = B_\mu^2 \, \delta_{ki}$. Hence, the
Fisher matrix for the matter power spectrum is simply a
projection of the multi-tracer Fisher matrix, where we sum
the Fisher information over all the tracers. Naturally, this
result is also identical to what was found in Eq. (16) in the
case of a single tracer, i.e., in that case the PVP estimator
reduces to the FKP estimator.

Now, if one fixes the effective biases and wishes to estimate
the matter power spectrum alone, then the generalization
of Eqs. (29) and (30) follows simply by replacing
the functional derivative $\partial/\partial P_{\mu,i} \to \partial/\partial P_{m,i}$, which is also
equivalent to taking $\partial/\partial P_{\mu,i} \to \sum_\nu B_\nu^2 \, \partial/\partial P_{\nu,i}$. The
resulting quadratic form is basically a projection of Eq. (37):


Q. 


{PVP) 


= 'Z (l/T 


(70) 


where the weighted field $f$ was defined in Eq. (35). Therefore,
in the PVP estimator the density contrasts of all tracers
are combined into a single weighted density contrast, at each
point in space. The cross-correlations are all averaged out,
in such a way that only the signal-to-noise of the matter
power spectrum is optimized.

The optimal estimator for the matter power spectrum 
is then simply obtained by multiplying the quadratic form 
by the inverse of the Fisher matrix, i.e.: 

1 


'l(PVP) 


Ni 


I/I' 


(71) 


4.1 The PVP estimator 

Suppose we fix all parameters $B_{\mu,i}$, and try to estimate the
matter power spectrum $P_m(k)$ using data from all tracers.
The optimal, unbiased estimator in that case was derived
by Percival et al. (2003) (PVP); see also Smith & Marian
(2015). The method used by PVP to construct their
estimator was the same as that used by FKP, i.e., the
weights which minimize the covariance $\mathrm{Cov}(\hat P_{m,i}, \hat P_{m,j})$ were
obtained through a variational approach.


where the normalization is basically given by Eq. (69):

Ni = I 1 . ( 72 ) 


Vr 


Em,i ivi 


<Fx (Fk 

(27r)3 


V 

1 + V 


Noting that $\mathcal{P}/P_m = \sum_\mu \bar n_\mu B_\mu^2$, we see that this estimator
is precisely that of PVP.

One may ask also the converse question: what if we want
to fix the matter power spectrum $P_m$, and estimate the effective
biases $B_\mu$? In that case, it is a simple exercise to show
that this would lead right back to the optimal multi-tracer
estimators, with the only difference that we would end up
measuring $P_{\mu,i}/P_{m,i}$. However, in reality we can only measure
the overall clustering of certain tracers of large-scale
structure, which means to estimate the combined product of
the matter power spectrum and the (square of the) effective
bias. Any distinction between what belongs to the matter
power spectrum, and what belongs to the bias, RSDs, NGs,
etc., can only be made after some other type of prior knowledge
is introduced, e.g., by constraining the normalization
and shape of the spectrum from CMB observations, by modelling
the RSDs, or by introducing priors on the bias from
gravitational lensing. Evidently, it would be an overuse of
information to fix the power spectrum in order to measure
the bias, and then employ that bias in order to estimate the
power spectrum. Both are measured together in galaxy surveys,
and this fundamental degeneracy can only be broken
by introducing additional data into the problem.


4.2 The role of cross-correlations 


Although our estimators only compute the power spectra
of the individual tracers, $P_\mu = B_\mu^2 P_m$, it is clear from Eqs.
(31)-(34) that the cross-correlations of the data, $\langle \delta_\alpha \delta_\beta \rangle$ (with
$\alpha \neq \beta$), are also taken into account. In fact, the multi-tracer
estimators express the optimal way to combine both the
auto- and the cross-correlations in the computation of the
physical parameters $B_\mu$ and $P_m$.

Depending on the total signal-to-noise ratio (SNR), the
power spectra of different tracers can have a positive or
negative covariance. Since the SNR of a tracer is given by
the amplitude of the power spectrum divided by shot noise,
$\mathcal{P}_\mu = P_\mu/(\bar n_\mu)^{-1}$, the total SNR of a survey is expressed by
the sum of the SNRs, $\mathcal{P} = \sum_\mu \mathcal{P}_\mu$. Hence, when $\mathcal{P} \gg 1$ the
total SNR is high, and conversely, the total SNR is low if
$\mathcal{P} \ll 1$.


When the total SNR is high, then from Eqs. (41)-(42)
we immediately see that the covariance between the clusterings
of different types of tracers (the off-diagonal terms)
is positive, in fact $C_{\mu\nu} = \mathrm{Cov}(P_\mu, P_\nu) \to 2 P_\mu P_\nu$. In relative
terms, the covariance in that limit is constant for all tracers,
$C_{\mu\nu}/(P_\mu P_\nu) \to 2$. This is simply cosmic variance.

In the converse limit, of very low total SNR, the cross-covariance
becomes negative, $C_{\mu\nu} \to -2 P_\mu P_\nu/\mathcal{P}^2$ ($\mu \neq \nu$).
In relative terms, the covariance in this limit is also independent
of the tracer species, as it happens in the high-SNR
limit, but now $C_{\mu\nu}/(P_\mu P_\nu) \to -2/\mathcal{P}^2$.
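
The two limits above can be checked numerically. The following is a minimal sketch, assuming that the covariance per unit of phase-space volume takes the form $C_{\mu\nu}/(P_\mu P_\nu) = \frac{4(1+\mathcal{P})}{\mathcal{P}} [\delta_{\mu\nu}/\mathcal{P}_\mu + (\mathcal{P}-1)/(2\mathcal{P})]$; the function name `relative_covariance` and the numerical values are illustrative choices, not part of the original analysis.

```python
import numpy as np

def relative_covariance(p):
    """Return C_{mu nu} / (P_mu P_nu) per unit of phase-space volume.

    p : clustering strengths (SNR) of each tracer, one entry per species.
    Assumed form: 4 (1 + P) / P * [delta_{mu nu} / P_mu + (P - 1) / (2 P)],
    where P = sum(p) is the total clustering strength.
    """
    p = np.asarray(p, dtype=float)
    total = p.sum()
    return 4.0 * (1.0 + total) / total * (
        np.diag(1.0 / p) + (total - 1.0) / (2.0 * total)
    )

high = relative_covariance([500.0, 500.0])   # total SNR >> 1
low = relative_covariance([0.005, 0.005])    # total SNR << 1
unit = relative_covariance([0.5, 0.5])       # total SNR = 1

print("high SNR, off-diagonal:", high[0, 1])   # approaches +2
print("low SNR, off-diagonal: ", low[0, 1])    # approaches -2 / P^2
print("P = 1, off-diagonal:   ", unit[0, 1])   # sign change happens here
```

In this form the off-diagonal term is proportional to $\mathcal{P}^2 - 1$, so the cross-covariance changes sign exactly at $\mathcal{P} = 1$.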


4.3 Tracers with low SNR 

An obvious situation of interest arises when some tracer has 
low SNR. This can happen if the tracer is sparse (h^ < 
10“®), or has a very small bias ($b_\mu \ll 1$), making its clustering
strength $\mathcal{P}_\mu$ very low in some bin or bandpower^.
The danger would be that the inverse of the Fisher matrix


^ Notice that the values of the power spectra of the tracers, $P_\mu =
B_\mu^2 P_m$, should never actually vanish. If they do, in some sense
(e.g., on extremely large or small scales), then $\mathcal{P} \to 0$, making
the entire Fisher matrix vanish for that bin, as it should indeed
happen in this case.


(the covariance matrix), which enters in the multi-tracer estimators
through Eq. (38), could propagate this noise to the
estimation of the spectra for the other tracers.

However, this is not the case, as can be seen from the expression
for the covariance matrix in Eqs. (41)-(42): because
the reciprocals of the individual clustering strengths (i.e.,
the noises) only appear in the diagonal terms of the covariance
matrix, if one of the tracers has a very high noise, this
will only affect that same tracer. In particular, this means
that our estimators are robust even when a galaxy survey
includes tracers whose SNRs are small.

This feature is very convenient if one would like to
split a survey into several sub-surveys, by dividing galaxies,
quasars and other objects into different categories according
to type, luminosity, color, morphology, etc., all of
which may be indicators of the bias of those tracers. In doing
that, even though the total SNR of the survey should
remain approximately constant, the SNR of each individual
tracer would decrease, leading us to wonder whether this
could lead to a degradation of the information derived from
that survey. However, the fact that a tracer with low SNR
only affects its own estimator means that this strategy can
be safely used even when some tracers have very low number
densities.
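
The robustness argument can be illustrated with a short numerical sketch. It assumes the same per-unit-phase-space covariance as above, $C_{\mu\nu}/(P_\mu P_\nu) = \frac{4(1+\mathcal{P})}{\mathcal{P}} [\delta_{\mu\nu}/\mathcal{P}_\mu + (\mathcal{P}-1)/(2\mathcal{P})]$, and compares a dense tracer analyzed alone with the same tracer analyzed together with a very sparse one. The helper name and the chosen clustering strengths are hypothetical.

```python
import numpy as np

def relative_covariance(p):
    """C_{mu nu} / (P_mu P_nu) per unit of phase-space volume (assumed form)."""
    p = np.asarray(p, dtype=float)
    total = p.sum()
    return 4.0 * (1.0 + total) / total * (
        np.diag(1.0 / p) + (total - 1.0) / (2.0 * total)
    )

# Dense tracer on its own: this reproduces the single-tracer (FKP) result,
# 2 (1 + P)^2 / P^2.
single = relative_covariance([10.0])[0, 0]

# Same dense tracer, now combined with an extremely sparse second tracer.
pair = relative_covariance([10.0, 1e-4])

print("dense tracer alone:        ", single)
print("dense tracer, in the pair: ", pair[0, 0])
print("sparse tracer, in the pair:", pair[1, 1])
```

The relative variance of the dense tracer is essentially unchanged by the presence of the sparse one, while the large noise shows up only in the diagonal element of the sparse tracer itself.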


4.4 Shot noise and the 1-halo term 


A fundamental assumption in our derivations has been that
the covariance of the counts of the tracers is given by Eq.
(19). However, this is often a simplification.

First, the statistics of counts in cells for galaxies in a
redshift survey is only approximately Poissonian, so shot
noise may be very different from the usual $1/\bar n_\mu$. Moreover,
besides the 2-halo term which usually dominates on large
scales, there is an additional contribution to the power spectrum
from the 1-halo term (Cooray & Sheth 2002). In the
$k \to 0$ limit the 1-halo term is effectively an additional contribution
to shot noise. In principle, any such corrections can
be fixed simply by allowing for a more general form of shot
noise for each tracer which, in the limit of negligible 1-halo
term and Poisson statistics, reduces to $\delta_{\mu\nu}/\bar n_\mu$.

A closely related problem arises when different types of
tracers occupy the same dark matter halos. Eq. (19) states
that the covariance between counts of different types of tracers
does not have any shot noise. However, the Halo Model
specifies that even for galaxies of different types there is
a non-vanishing 1-halo term, which is degenerate with shot
noise in the $k \to 0$ limit. Usually this is a small contribution,
subdominant to the shot noise of the individual tracers, but
it ultimately means that the noise cannot be assumed to be
diagonal in the tracers.

A third, and perhaps more serious problem, arises from
the fact that different tracers are often found to inhabit halos
of very similar masses. Most galaxies (as well as quasars)
are found in halos of masses in the range Mq < 

Mh < Mq, with relatively small differences between 

the distributions of each type of object within halos, the
so-called halo occupation distributions, or HODs (Martinez
& Saar 2001; Cooray & Sheth 2002). In particular, this
means that the biases of those tracers are not entirely independent.

In other words, different tracers can be correlated by
more than just the underlying dark matter field. These correlations
arise through the 1-halo terms of the power spectra,
which contribute to the covariances of the counts of
those tracers, as well as through additional contributions
to the bispectrum and trispectrum. But the trispectrum
also defines the covariance of the power spectra through
$\langle P_\mu(k) P_\nu(k') \rangle \sim \langle \delta_\mu \delta^*_\mu \delta_\nu \delta^*_\nu \rangle$, which means that it
is not possible to assume that the 4-point function is given only
by its disconnected (Gaussian) pieces, i.e., it is
not true anymore that $\langle \delta_\mu \delta_\nu \delta_\alpha \delta_\beta \rangle = C_{\mu\nu} C_{\alpha\beta} + C_{\mu\alpha} C_{\nu\beta} + C_{\mu\beta} C_{\nu\alpha}$.

It is straightforward to incorporate the 1-halo term sys¬ 
tematically into the covariance in all our calculations (see 
Section]^. However, if there are significant correlations be¬ 
tween the power spectra arising from the 1-, 2- and 3-halo 
terms of the trispectrum, then the counts cannot be assumed
to be nearly Gaussian. In that case it would be erroneous
to assume that the tracers are truly independent, and a key
assumption of our method would be undermined. Nevertheless,
Smith & Marian (2015) were able to extend the PVP
method (which does not rely on a direct construction based
on the pixel covariance, but on variational methods) to incorporate
these contributions from the Halo Model in formal
expressions for the weights and for the Fisher matrix. However,
recall that the PVP method, as well as its extension
by Smith & Marian (2015), only tackle the estimation of the
matter power spectrum, after assuming that the bias, RSDs,
NGs, etc., are known and fixed.

unit of phase space for this new set of parameters is:


4.5 Degenerate tracers 

While tracers with different biases can possess correlations
beyond those associated with the large-scale structure of
the Universe, it is not necessarily true that two tracers that
have similar biases must be highly correlated. Two types of
galaxies may have different HODs, but their biases could
coincide. In those cases, if there is a significant contribution
from the 1-halo term, then it may still make sense to
treat those species separately. It is only when two tracers
have the same HOD (or, equivalently, the same bias, 1-halo
term, 2-halo term, 3-halo term, etc.), that they should be
consolidated into a single species.

However, suppose we do not know whether or not two
types of galaxies have the same HODs. If we use the multi-tracer
approach and treat those two species as if they were
different tracers, but they turn out to have the same HODs,
would that initial assumption imply an overestimate of the
information, or some distortion in the estimators?

The answer is no, and this follows from a very interesting
property of the multi-tracer Fisher matrix. As shown
in Abramo & Leonard (2013), the Fisher matrix can be
diagonalized by changing variables, from the original power
spectra $P_\mu = B_\mu^2 P_m$ to a new set of parameters which
correspond to the total clustering strength and certain ratios
between the power spectra (the relative clustering
strengths). In the case of two tracers with spectra $P_1$ and $P_2$,
a choice of parameters which diagonalizes the Fisher matrix
is $\log\mathcal{P}$, with $\mathcal{P} = \mathcal{P}_1 + \mathcal{P}_2$, and $\log\mathcal{R} = \log(\mathcal{P}_1/\mathcal{P}_2)$ (or, equivalently,
$\log\mathcal{P}$ and $\log(\mathcal{P}_2/\mathcal{P}_1) = -\log\mathcal{R}$). The Fisher information per


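$$ F[\log\mathcal{P}, \log\mathcal{R}] = \begin{pmatrix} \dfrac{1}{2} \dfrac{\mathcal{P}^2}{(1+\mathcal{P})^2} & 0 \\ 0 & \dfrac{1}{4} \dfrac{\mathcal{P}_1 \mathcal{P}_2}{1+\mathcal{P}} \end{pmatrix} . \tag{73} $$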


For an arbitrary number $N$ of tracers, the change of
variables that diagonalizes the Fisher matrix is identical to
a change from Cartesian coordinates to spherical coordinates
in $N$ dimensions. Namely, if we regard the $N$ clustering
strengths as $\mathcal{P}_1 \to x_1^2$, $\mathcal{P}_2 \to x_2^2$, etc., then the
variables that diagonalize the Fisher matrix are the radius,
$\mathcal{P} \to r^2 = \sum_i x_i^2$, together with the $(N-1)$ angles
$\tan^2\theta = (r^2 - x_N^2)/x_N^2$, $\cot^2\phi_1 = (r^2 - x_N^2 - x_{N-1}^2)/x_{N-1}^2$,
$\cot^2\phi_2 = (r^2 - x_N^2 - x_{N-1}^2 - x_{N-2}^2)/x_{N-2}^2$, etc. Hence, the
angle variables correspond to certain ratios between the tracers
(or relative clustering strengths), for which the matter
power spectrum (the radius) cancels out. In particular, this
means that the relative clustering strengths are immune to
some statistical limitations that affect the matter power:
namely, the relative clustering strengths can be measured
to an accuracy which is not constrained by cosmic variance
(Abramo & Leonard 2013).


Coming back to our example of the two tracers, if we
now stipulate that they are in fact a single species, then
$P_1 = P_2$, and $\mathcal{R} \to \bar n_1/\bar n_2$ is not a free parameter anymore,
so $d\log\mathcal{R} \to 0$. This is equivalent to projecting the $2 \times 2$
Fisher matrix into a single component, thus eliminating the
line and column corresponding to $\log\mathcal{R}$, and leaving $\log\mathcal{P}$
as the sole free parameter. Indeed, since $d\log\mathcal{R} \to 0$ in this
case, we cannot constrain physical parameters such as RSDs
or NGs on the basis of a measurement of $\mathcal{R}$.

Since the Fisher matrix is diagonal, the Fisher information
for $\log\mathcal{P}$ is unchanged after this projection (or
marginalization). In particular, the variance $\sigma^2(\log\mathcal{P}) =
\sigma^2(\mathcal{P})/\mathcal{P}^2$ is untouched by a marginalization over $\mathcal{R}$, and
it is still given by the inverse of the same Fisher matrix element
in Eq. (73), so $\sigma^2(\mathcal{P}) = 2(1+\mathcal{P})^2$, which is nothing
but the covariance (in units of phase space volume) for a
single tracer species; see Eq. (17).
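
The diagonalization for two tracers can be verified numerically. The sketch below assumes that the Fisher matrix per unit of phase-space volume for the variables $\log\mathcal{P}_\mu$ has the form $\mathcal{F}_{\mu\nu} = \frac{1}{4} [\delta_{\mu\nu} \mathcal{P}_\mu \mathcal{P}/(1+\mathcal{P}) + \mathcal{P}_\mu \mathcal{P}_\nu (1-\mathcal{P})/(1+\mathcal{P})^2]$, which is the form whose inverse and projections match the limits quoted in the text; the function name `fisher_log` and the values of the clustering strengths are illustrative.

```python
import numpy as np

def fisher_log(p):
    """Fisher matrix for log(P_mu) per unit of phase-space volume (assumed form)."""
    p = np.asarray(p, dtype=float)
    total = p.sum()
    return 0.25 * (
        np.diag(p) * total / (1.0 + total)
        + np.outer(p, p) * (1.0 - total) / (1.0 + total) ** 2
    )

p1, p2 = 3.0, 0.7            # illustrative clustering strengths
total = p1 + p2
F = fisher_log([p1, p2])

# Jacobian J[mu, a] = d log(P_mu) / d theta_a, with theta = (log P, log R),
# where P = P_1 + P_2 and R = P_1 / P_2.
J = np.array([[1.0,  p2 / total],
              [1.0, -p1 / total]])

F_new = J.T @ F @ J          # Fisher matrix in the new variables

print(np.round(F_new, 12))
print("sigma^2(P) =", total**2 / F_new[0, 0], "vs", 2.0 * (1.0 + total) ** 2)
```

The off-diagonal elements vanish to machine precision, and the first diagonal element reproduces the single-tracer result $\sigma^2(\mathcal{P}) = 2(1+\mathcal{P})^2$.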

The argument above extends to any number of tracers:
since the Fisher matrix is diagonal in the "spherical coordinates"
(the total and relative clustering strengths), projecting
some of the tracers out by combining them into new
species does nothing to the Fisher information of the total
clustering strength, or to the relative clustering strengths of
the remaining species. Therefore, in principle there is no difference
between treating two identical tracer species (with
the same HODs) separately, or joining them into a single
type of tracer. Of course, one can always destroy information
by treating two different tracer species as if they were
just one, but there is no penalty for breaking a catalog into
as many sub-catalogs as one wishes, even if some of the
tracers turn out to be completely degenerate.

The argument is a bit more involved if we work with 
the power spectra as the parameters, but the conclusion is 
the same (see Appendix A). 


5 TESTING THE ESTIMATORS 

In Sections 2 and 3 we derived the optimal multi-tracer estimators.
We also obtained the covariance of the estimators,
Case   $\bar n_1$ ($h^3\,{\rm Mpc}^{-3}$)   $b_1$   $\bar n_2$ ($h^3\,{\rm Mpc}^{-3}$)   $b_2$
A      $1 \times 10^{-2}$                    1.0     $1 \times 10^{-2}$                    1.2
B      $1 \times 10^{-2}$                    1.0     $1 \times 10^{-5}$                    1.2
C      $1 \times 10^{-5}$                    1.0     $1 \times 10^{-5}$                    1.2

Table 1. The three cases we use to illustrate the application of
the multi-tracer method. In all cases tracer 1 has bias $b_1 = 1.0$,
and tracer 2 has bias $b_2 = 1.2$. In case A the two tracers have
high number densities, so the signal-to-noise is high. In case B
tracer 1 is dense, but tracer 2 is sparse. In case C both tracers
are sparse, so the signal-to-noise is low.


which is simply the inverse of the multi-tracer Fisher matrix.
In this Section we apply that formalism to simple simulated
galaxy maps. The implementation of the estimators is quite
straightforward, and should be familiar to anyone who has
used the FKP or the PVP methods. Although we test the
method in real space, the extension to redshift space is trivial:
instead of bins in $|\vec k|$, one should have bins both in $k$ and
in $\mu_k$.

For the generation of the galaxy maps we chose a simple
method that is both efficient and computationally cheap
enough that hundreds of realizations of a single fiducial matter
power spectrum and galaxy model can be analyzed. We
implemented the multi-tracer estimators in a cubic grid with
constant, uniform mean number density (or selection function),
for the case of two different species of tracers, with
biases $b_1 = 1.0$ and $b_2 = 1.2$. We checked that the estimators
are as robust as the FKP or PVP methods against
variations in the survey geometry.

In order to test the performance of the estimators in 
situations of high or low signal-to-noise, we consider three 
different cases, as shown in Table 1. In each case we generate 
1000 galaxy maps (each map consisting of two catalogs, one 
for each tracer), and estimate the spectra using the methods 
described in Sec. 3. 


5.1 Lognormal maps 


Our mocks follow the same procedure used in, e.g., PVP.
A detailed description of the generation of lognormal maps
can be found in Coles & Jones (1991). The basic idea is that
a Gaussian density contrast $\delta^{(G)}(\vec x)$ is not bounded from
below, which implies that negative values for the density
are possible in any finite-volume realization of such a Gaussian
field. Lognormal fields, on the other hand, are positive-definite,
so we map the Gaussian field into a lognormal field.

A lognormal field obeys the condition $\delta^{(L)}(\vec x) > -1$
and approximately describes the non-linear density field at
low redshifts. We can obtain a lognormal density field in
terms of a Gaussian density field through the definition
$1 + \delta^{(L)}(\vec x) = \exp[\delta^{(G)}(\vec x) - \sigma_G^2/2]$, where $\sigma_G^2$ is the variance
of the Gaussian field inside a cell. The Gaussian correlation
function is related to the physical (assumed lognormal)
correlation function by $\xi^{(G)}(r) = \ln[1 + \xi(r)]$. Given a
fiducial cosmology, we obtain the $z = 0$ matter power spectrum
$P_m(k)$ from the Boltzmann code CAMB^ (Lewis, Challinor
& Lasenby 2000), and inverse-Fourier transform it to
get the physical correlation function $\xi(r)$. We then convert
the physical (assumed lognormal) correlation function

^ http://CAMB.info 


to the correlation function of the corresponding Gaussian
field, and Fourier-transform that correlation function into a
power spectrum for the Gaussian field. This is the power
spectrum which is employed to generate the Gaussian random
modes for the density contrast.
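
The relation between the Gaussian and lognormal correlation functions can be checked with a small Monte Carlo experiment. This is a minimal sketch under stated assumptions: instead of a full three-dimensional field, it draws pairs of correlated Gaussian contrasts with an arbitrary variance `sigma2` and cross-correlation `xi_g` (both illustrative values), applies the lognormal transformation, and compares the measured correlation with $\exp(\xi^{(G)}) - 1$.

```python
import numpy as np

rng = np.random.default_rng(0)

sigma2 = 0.5   # variance of the Gaussian field in a cell (illustrative)
xi_g = 0.3     # Gaussian correlation between two cells (illustrative)

# Pairs of Gaussian density contrasts with the requested covariance.
cov = np.array([[sigma2, xi_g],
                [xi_g, sigma2]])
g = rng.multivariate_normal([0.0, 0.0], cov, size=400_000)

# Lognormal transformation: 1 + delta_L = exp(delta_G - sigma^2 / 2).
delta_l = np.exp(g - sigma2 / 2.0) - 1.0

xi_l_measured = np.mean(delta_l[:, 0] * delta_l[:, 1])
xi_l_expected = np.expm1(xi_g)              # xi_L = exp(xi_G) - 1
xi_g_recovered = np.log1p(xi_l_measured)    # xi_G = ln(1 + xi_L)

print("measured  xi_L:", xi_l_measured)
print("expected  xi_L:", xi_l_expected)
print("recovered xi_G:", xi_g_recovered, "(input:", xi_g, ")")
```

The same `np.log1p` call, applied to a tabulated physical correlation function, gives the Gaussian correlation function that is then Fourier-transformed into the power spectrum of the Gaussian modes.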

The next step is the generation of biased lognormal
maps for each galaxy type. We define the lognormal maps
as $1 + \delta^{(L)}_\mu(\vec x) = \exp[b_\mu \delta^{(G)}(\vec x) - b_\mu^2 \sigma_G^2/2]$^. Finally, we
create the galaxy maps as independent Poisson realizations
over the lognormal fields. Each tracer has its own spatial
number density $\bar n_\mu(\vec x)$ and bias $b_\mu$, so that the maps
for each tracer are given by integer numbers for each cell
of volume $dV$ in our cube through a Poisson sampling,
$N_\mu(\vec x) \leftarrow \mathrm{Pois}\{\bar n_\mu(\vec x) [1 + \delta^{(L)}_\mu(\vec x)] dV\}$, where $\mathrm{Pois}\{\lambda\}$ is a Poisson
distribution with mean $\lambda$.
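
The map-making steps above can be sketched as follows. This is an illustration rather than the pipeline used for the results in this Section: the Gaussian field is a stand-in with uncorrelated cells (no input power spectrum), and the grid size, cell size, `sigma_g` and number densities are arbitrary values chosen to keep the example small. The helper `poisson_lognormal_map` is a hypothetical name.

```python
import numpy as np

rng = np.random.default_rng(1)

n_grid = 32          # cells per side (illustrative)
cell_size = 5.0      # cell side in Mpc/h (illustrative)
dV = cell_size ** 3  # cell volume
sigma_g = 0.4        # rms of the Gaussian field per cell (illustrative)

# Stand-in Gaussian density contrast: independent cells with variance sigma_g^2.
delta_g = rng.normal(0.0, sigma_g, size=(n_grid, n_grid, n_grid))

def poisson_lognormal_map(delta_g, sigma2, bias, nbar, dV, rng):
    """Biased lognormal contrast followed by Poisson sampling of the counts."""
    # 1 + delta_L = exp(b * delta_G - b^2 sigma^2 / 2), so delta_L > -1.
    delta_l = np.exp(bias * delta_g - 0.5 * bias**2 * sigma2) - 1.0
    # Integer counts per cell with mean nbar * (1 + delta_L) * dV.
    return rng.poisson(nbar * (1.0 + delta_l) * dV)

# Two tracers sharing the same underlying Gaussian field.
counts_1 = poisson_lognormal_map(delta_g, sigma_g**2, 1.0, 1e-2, dV, rng)
counts_2 = poisson_lognormal_map(delta_g, sigma_g**2, 1.2, 1e-2, dV, rng)

print("mean counts per cell:", counts_1.mean(), counts_2.mean())
print("expected mean:       ", 1e-2 * dV)
```

Because both tracers are sampled from the same Gaussian field, their maps are correlated through the underlying density, while their shot noises are independent.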

In the three cases detailed above we considered cubic
$256^3$ grids with a fiducial cosmology characterized by a flat
$\Lambda$CDM model with $\Omega_b h^2 = 0.0226$, $\Omega_{\rm CDM} h^2 = 0.112$ and
$h = 0.72$. Each cube has a physical (comoving) volume
of $(1280\,h^{-1}\,{\rm Mpc})^3$. It is important to note that lognormal
maps created this way do not show the usual effect of suppression
in power at small scales when a smoothing algorithm
is applied to convert from a continuous distribution
to a discrete grid, such as Nearest Grid Point (NGP). In any
case, the formalism is general enough to accommodate this
necessity. Furthermore, since the grid used is cubic, it is
unnecessary to deconvolve the estimated spectra from the window
function. Even though any discretization scheme could
be used, the square grid is required in order to employ an
implementation in terms of a fast Fourier transform (FFT),
which is, as a matter of fact, the only practical way to perform
a Fourier analysis of large data sets.


5.2 The data analysis algorithm 

With the galaxy maps $n_\mu(\vec{x})$ as input, along with an initial guess for the biases $b_\mu$, we can start to deploy the machinery developed in Secs. 2 and 3. A preliminary step, had we not explicitly generated maps with constant, uniform number densities, would be to estimate $\bar{n}_\mu(\vec{x})$.

We start by constructing random maps, $n^r_\mu(\vec{x})$, for each tracer as a Poisson process, in each cell of the grid, with the same shape for the mean number density as the data (i.e., the real maps), but with a larger number of particles, $\bar{n}^r_\mu = \bar{n}_\mu/\alpha_\mu$, where $\alpha_\mu$ are small constants. We then construct the density contrasts according to Eq. (59): $\delta_\mu(\vec{x}) = (n_\mu - A_\mu n^r_\mu)/\bar{n}_\mu$, where we recall that $A_\mu$ are constants found according to the discussion in Sec. 3.3.
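A minimal sketch of this step is given below, under the assumption that the density contrast has the FKP-like form (data minus rescaled randoms, divided by the mean density) and that the constant multiplying the random map can be set equal to $\alpha$. The function name and all numbers are hypothetical, and the "data" here are an unclustered Poisson map used only to exercise the code.

```python
import numpy as np


def density_contrast(n_data, nbar, alpha, cell_volume, rng):
    """FKP-style contrast (N - alpha * N_random) / (nbar * dV) on a grid of counts."""
    mean_random = nbar / alpha * cell_volume           # expected random counts per cell
    n_random = rng.poisson(mean_random, size=n_data.shape)
    delta = (n_data - alpha * n_random) / (nbar * cell_volume)
    return delta, n_random


rng = np.random.default_rng(2)
nbar, cell_volume, alpha = 1.0e-2, 125.0, 0.01
n_data = rng.poisson(nbar * cell_volume, size=(32, 32, 32))  # unclustered toy "data"
delta, n_random = density_contrast(n_data, nbar, alpha, cell_volume, rng)
```

Because the random map is $1/\alpha$ times denser than the data, its own shot noise adds only a fraction of order $\alpha$ to the noise of the density contrast.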

With an initial guess for the biases and for the amplitude of the power spectrum, we can construct $\mathcal{P}_\mu$ and $\mathcal{P} = \sum_\mu \mathcal{P}_\mu$, plug them into the weights of Eq. (33), and calculate the weighted density contrasts of Eq. (59). We then perform an FFT of $f(\vec{x})$ and $f_\mu(\vec{x})$, in order to obtain the integrand

® Notice that, for a lognormal map with bias $b$, the correlation function used in the generation of the Gaussian random modes should be defined as $\xi^{(G)}(x) = b^{-2} \ln[1 + b^2\, \xi(x)]$, with $\xi(x)$ the physical matter correlation function. Therefore, strictly speaking, this prescription is only self-consistent when there is a single type of galaxy, with one bias. However, using the same correlation function for tracers of different biases introduces only a small spectral distortion on small scales, which we corrected for in our simulations.
Figure 1. Estimated auto-spectra v. real auto-spectra. Filled (red) circles correspond to the estimated power spectrum of tracer 1, with bias $b_1 = 1.0$, and filled (blue) squares correspond to the estimated power spectrum of tracer 2, with bias $b_2 = 1.2$. The symbols and error bars correspond to the mean and to the variance, respectively, of 1000 realizations. The dashed lines are the input (theoretical) spectra of the tracers, given their biases and our fiducial cosmology. The upper, middle and lower panels correspond to cases A, B and C, respectively (see Table 1). The error bars are the theoretical ones, i.e., the inverse of the Fisher matrix, Eq. (25).


of Eq. (36). Taking proper care of the volume factors (in real and in Fourier space), this step should be analogous to the average over modes in Eq. (2.4.5) of FKP.
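The average over modes can be sketched as a shell average of the squared FFT amplitudes. The code below is a generic illustration, not the estimator of Eq. (36) itself: it bins $|f(\vec{k})|^2/V$ into bandpowers of width `dk`, and the function name, the normalization convention and the white-noise test field are assumptions.

```python
import numpy as np


def bandpowers(field, box, dk, kmax):
    """Shell-average |FFT(field)|^2 / V into bins of width dk up to kmax."""
    n = field.shape[0]
    cell = (box / n) ** 3
    f_k = np.fft.fftn(field) * cell               # discrete approximation to the continuous FT
    power = (np.abs(f_k) ** 2 / box**3).ravel()   # |f(k)|^2 / V for every mode
    k1d = 2.0 * np.pi * np.fft.fftfreq(n, d=box / n)
    kx, ky, kz = np.meshgrid(k1d, k1d, k1d, indexing="ij")
    kmag = np.sqrt(kx**2 + ky**2 + kz**2).ravel()
    nbins = int(round(kmax / dk))
    edges = dk * np.arange(nbins + 1)
    idx = np.digitize(kmag, edges) - 1            # bin index of every mode
    keep = (kmag > 0) & (idx < nbins)             # drop k = 0 and modes beyond kmax
    counts = np.bincount(idx[keep], minlength=nbins)
    sums = np.bincount(idx[keep], weights=power[keep], minlength=nbins)
    mean = np.full(nbins, np.nan)                 # empty bins stay NaN
    filled = counts > 0
    mean[filled] = sums[filled] / counts[filled]
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, mean, counts


rng = np.random.default_rng(3)
white = rng.normal(size=(32, 32, 32))             # unit-variance white noise
k_c, p_band, n_modes = bandpowers(white, box=320.0, dk=0.02, kmax=0.3)
```

With this normalization, white noise of unit variance per cell has a flat spectrum equal to the cell volume, which gives a quick check of the volume factors.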

The next step is to subtract the biases of the estimators, the $\delta Q_{\mu,i}$ in Eq. (36) or, equivalently, Eq. (67). Assuming that averages over bins are such that $\langle AB \rangle_i \simeq \langle A \rangle_i \langle B \rangle_i$, and taking a single value for all the $\alpha_\mu \to \alpha$, Eq. (40) can be rearranged to yield:

1 + a f d^x d^k 




L 


(27r)3 (1+P)2 


(74) 


With our choice of $\alpha$ = 10“®, we find that $A_\mu \to \alpha$ to an excellent approximation, which means that the biases of the estimators are given only by Eq. (74); see Sec. 3.3. Finally, the estimated power spectra are computed with the help of
Eq. 

We present our results for the estimated spectra of two types of tracers in three cases, A, B and C (see Table 1 and Fig. 1). Case A represents a low-redshift survey which is highly complete, so both tracers are dense. Case B represents a low- or intermediate-redshift survey, with one dense species of tracer (type 1; say, red galaxies) and one sparse species of tracer (type 2; say, quasars). Case C represents a high-redshift survey, with two sparse types of tracers.

Our estimates were evaluated in evenly separated bandpowers with $\Delta k = 0.005\, h\,{\rm Mpc}^{-1}$. We show the estimated spectra in Fig. 1 only up to $k = 0.2\, h\,{\rm Mpc}^{-1}$, slightly into the nonlinear regime but still below the Nyquist frequency, such that our results are not affected by discretization effects. When estimating the spectra we adopted a commonly used simplification, which is to fix the value of the matter power spectrum that is used in the weights, Eq. (33); in
our case, we found that fixing Pm —> 10^ h~^ Mpc'^ in the 


weights was a suitable choice. Our results did not change 
significantly over the dynamical range of interest when that 
value was multiplied by 2 or by 1/2. 


5.3 Empirical v. theoretical covariances 


We now check whether the theoretical covariance matrix 
(the inverse of the multi-tracer Fisher matrix) is a good ap¬ 
proximation to the true (i.e., empirical) covariance matrix. 
If the theory is accurate, then the method is validated; if it 
is not, then the multi-tracer estimators are sub-optimal. 

The empirical result was obtained from 1000 realizations. This was compared with the theoretical covariance, i.e., the inverse of the binned Fisher matrix of Eq. (25):

$$ {\rm Cov}(P_{\mu,i}, P_{\nu,j}) = \delta_{ij}\, P_{\mu,i} P_{\nu,i} \left[ \int_{V_i} \frac{d^3x\, d^3k}{(2\pi)^3}\, \mathcal{F}_{\mu\nu} \right]^{-1} \tag{75} $$

where $\mathcal{F}_{\mu\nu}$ was defined in Eq. (26).


In Fig. 2 we present the comparison between the theoretical and empirical covariances for the auto-spectra of the two species, obtained respectively from Eq. (75) and from taking the standard deviation of 1000 lognormal realizations. We find that our theoretical expression properly reproduces the behavior of the statistical fluctuations in all cases, matching the empirical variances more closely than the FKP method does. The theoretical variances sometimes slightly underestimate the empirical variances, which is consistent with the fact that the inverse of the Fisher matrix is a lower bound on the true covariance (the Cramér-Rao bound). This is in line with what is usually found in implementations of the FKP method. In cases B and C the multi-tracer estimator performs significantly better than the FKP estimator on all scales.

In Fig. 3 we compare the theoretical and empirical variances for the cross-spectra of the two tracers (green triangles), and for the ratios of the two spectra, $P_1/P_2$ (black diamonds). Since the FKP method cannot predict theoretical covariances in these two cases, we only show the multi-tracer theoretical variances. The theoretical variance for the ratio $P_1/P_2$ follows from the multi-tracer Fisher information matrix, Eq. (26), which can be diagonalized by a change of variables (Abramo & Leonard 2013), where the new parameters (the “eigenvectors” of the Fisher matrix) are not the individual clustering strengths $\mathcal{P}_\mu$, but the total clustering strength, $\mathcal{P}$, and certain ratios between the
Figure 2. Theoretical v. empirical relative covariances of the auto-spectra, ${\rm Cov}(P_{\mu,i}, P_{\mu,i})/P_{\mu,i}^2$. The upper, middle and lower panels correspond to cases A, B and C, respectively (see Table 1). Red circles and blue squares correspond to the theoretical covariances of tracers 1 and 2, respectively. The lines of the same colors are the standard deviation of our 1000 lognormal mocks. Solid symbols and lines correspond to multi-tracer estimates, while open symbols and dashed lines correspond to FKP estimates. In case A (upper panel), since the two tracers have high signal-to-noise (both $\mathcal{P}_1 \gg 1$ and $\mathcal{P}_2 \gg 1$ in this range of scales), both the multi-tracer and the FKP formulas for the auto-covariances reduce to ${\rm Cov}(P_{\mu,i}, P_{\mu,i})/P_{\mu,i}^2 \propto (V_{x,i} V_{k,i})^{-1} \sim k^{-2}$ [see Eq. (42)]. Hence, in this case most symbols and lines overlap. In most cases, the empirical covariances are slightly higher than the theoretical ones, as expected. In case B (middle panel), the covariance of the spectrum of the sparse tracer species is significantly higher in the FKP method: in this case, the multi-tracer method reduces the uncertainty in the spectrum by a large factor.


clustering strengths. In particular, a diagonal Fisher matrix means that the degrees of freedom are independent: there are no cross-covariances. For two types of tracers, the variables which diagonalize the $2 \times 2$ Fisher matrix are $\mathcal{P} = \mathcal{P}_1 + \mathcal{P}_2$ and $\mathcal{P}_1/\mathcal{P}_2$ (or, equivalently, $\mathcal{P}$ and $\mathcal{P}_2/\mathcal{P}_1$). As shown in Abramo & Leonard (2013), the Fisher matrix per unit of phase space volume for $\log(\mathcal{P}_1/\mathcal{P}_2)$ is $\mathcal{F}_{\rm ratio} = \mathcal{P}_1 \mathcal{P}_2/[4(1 + \mathcal{P}_1 + \mathcal{P}_2)]$, from which it follows that the relative covariance of that ratio is $[\int d^3x\, d^3k/(2\pi)^3\, \mathcal{F}_{\rm ratio}]^{-1}$. This figure demonstrates the power of the multi-tracer technique to measure $P_1/P_2 = B_1^2(z, k, \mu_k)/B_2^2(z, k, \mu_k)$, something that can be used to place stronger constraints not only on the biases of the two species, but also on RSDs, NGs, etc.

Figure 3. Theoretical v. empirical covariances of the cross-spectra and of the ratios between the spectra. The ratios were defined as $P_1/P_2$ (the relative covariance is identical for $P_2/P_1$). The upper, middle and lower panels correspond to cases A, B and C, respectively (see Table 1). Diamonds (black) correspond to the theoretical relative covariances of the cross-spectra, while triangles (green) correspond to the theoretical covariance for the ratios between the spectra (see text). The solid lines correspond to the empirical covariances, using the multi-tracer estimators (we do not show the results using the FKP estimator in these plots because it performs significantly worse than the multi-tracer estimators, and in any case the FKP method does not predict these covariances). Notice that in case C (lower panel) the covariance of the cross-correlations is negative, since $\mathcal{P} < 1$; see Eqs. (41)-(42). Notice also that in case A the ratio between the spectra has a much lower uncertainty than the cross-correlation (for an explanation, see the text).
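The diagonalization can be checked numerically. The sketch below assumes that the Fisher density of Eq. (26), in the variables $\log \mathcal{P}_\mu$, has the form $\mathcal{F}_{\mu\nu} = \frac{1}{4}[\delta_{\mu\nu} \mathcal{P}_\mu \mathcal{P}/(1+\mathcal{P}) + \mathcal{P}_\mu \mathcal{P}_\nu (1-\mathcal{P})/(1+\mathcal{P})^2]$, as in Abramo (2013); Eq. (26) is not reproduced in this section, so this form, like the function names, should be read as an assumption.

```python
import numpy as np


def fisher_log(p):
    """Assumed multi-tracer Fisher density F[log P_mu, log P_nu]."""
    p = np.asarray(p, dtype=float)
    tot = p.sum()
    return 0.25 * (np.diag(p) * tot / (1 + tot)
                   + np.outer(p, p) * (1 - tot) / (1 + tot) ** 2)


def fisher_total_and_ratio(p1, p2):
    """Fisher density in the variables (log P, log(P1/P2)) for two tracers."""
    tot = p1 + p2
    # Jacobian d(log P_mu) / d(log P, log(P1/P2)); rows: tracers, columns: new variables
    jac = np.array([[1.0, p2 / tot],
                    [1.0, -p1 / tot]])
    return jac.T @ fisher_log([p1, p2]) @ jac


f_new = fisher_total_and_ratio(3.0, 0.5)
```

Under that assumption the off-diagonal entries vanish, the entry for the ratio reproduces $\mathcal{P}_1 \mathcal{P}_2/[4(1+\mathcal{P})]$, and the entry for the total clustering strength is $\mathcal{P}^2/[2(1+\mathcal{P})^2]$, which is the familiar FKP information density.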

The upper panel of Fig. 4 shows the covariance matrix for tracer 2 ($b_2 = 1.2$) in case B, i.e., ${\rm Cov}(P_{2,i}, P_{2,j})$. We exploited the symmetry of the covariance matrix under $k_i \leftrightarrow k_j$ in order to compare the multi-tracer and FKP estimators directly. In the lower panel of this figure we show the correlation matrix, defined as ${\rm Corr}_{ij} = {\rm Cov}_{ij}/\sqrt{{\rm Cov}_{ii}\, {\rm Cov}_{jj}}$. We find that both the multi-tracer and the FKP estimators yield roughly similar correlation matrices, with weakly correlated bins up to scales $k < 0.1\, h\,{\rm Mpc}^{-1}$.
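The empirical covariance and correlation matrices can be obtained from a stack of estimated spectra along the following lines. The array of synthetic, uncorrelated Gaussian "bandpowers" is a placeholder for the mock realizations, and the function name is an assumption.

```python
import numpy as np


def covariance_and_correlation(spectra):
    """spectra: array of shape (n_realizations, n_bins) with estimated bandpowers."""
    cov = np.cov(spectra, rowvar=False)               # empirical covariance between bins
    sigma = np.sqrt(np.diag(cov))
    corr = cov / np.outer(sigma, sigma)               # Corr_ij = Cov_ij / sqrt(Cov_ii Cov_jj)
    rel_var = np.diag(cov) / spectra.mean(axis=0) ** 2  # relative variance per bin
    return cov, corr, rel_var


rng = np.random.default_rng(4)
mock_spectra = rng.normal(loc=100.0, scale=5.0, size=(1000, 8))  # synthetic stand-in
cov, corr, rel_var = covariance_and_correlation(mock_spectra)
```

With 1000 realizations the variance in each bin is itself known only to a few percent, which sets the level at which theoretical and empirical variances can be compared.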

The upper panel of Fig. 4 together with the middle
Figure 4. Upper panel: covariance matrix for tracer 2 in case B.
The upper triangle is the result using the multi-tracer estimator, 
and the lower triangle results from using the FKP estimator. The 
multi-tracer technique performs significantly better on all scales. 
Lower panel: correlation matrix for tracer 2 in case A. Both the 
multi-tracer and the FKP estimators perform similarly regarding 
the correlations between Fourier bins. We checked that the esti¬ 
mators result in similar correlation matrices for both tracers, in 
the three different cases we analyzed. 


panels of Figs. and shows that in case B the multi-tracer
estimator performs significantly better than the FKP
estimator at all scales, with uncertainties up to one order of 
magnitude smaller for the spectrum of the sparse tracer. The 
multi-tracer technique is also clearly superior in estimating 
the auto-spectra in case C, when both tracers are sparse — 
see the lower panel of Fig. 

6 INCLUDING THE 1-HALO TERM 

The fundamental object in this paper, which was used to derive the Fisher information matrix, as well as the optimal weights, is the pixel covariance. In the limit where bias and RSDs depend weakly on $k$, the covariance can be approximated by Eq. (20). However, this is not a complete description: in addition to the “signal”, $P_\alpha = B_\alpha^2 P_m$, and the shot noise, $\delta_{\alpha\beta}/\bar{n}_\alpha$, there is another source of correlations between the density contrasts of different species of tracers at different points in space: the 1-halo term of the power spectrum. According to the Halo Model (Cooray & Sheth 2002), dark matter halos are the genuine tracers of the underlying matter density, while galaxies only trace the halos. In particular, this means that many galaxies may be hosted by the same halo, in which case they would be tracing the same features of the underlying fluctuations of the matter density.

This additional covariance between galaxy counts is expressed by the 1-halo term:

$$ P^{1h}_{\alpha\beta}(k) = \int d\ln M\, \frac{d\bar{n}_h}{d\ln M}\, u^2(k|M)\, \frac{\langle N_\alpha N_\beta \rangle_M}{\bar{n}_\alpha \bar{n}_\beta} \tag{76} $$

where $d\bar{n}_h/d\ln M$ is the mass function for halos of mass $M$, $N_\alpha$ is the number of galaxies of type $\alpha$, and $u(k|M)$ is the Fourier transform of the halo profile (Cooray & Sheth 2002). The expectation value is over the probability distribution function for the numbers of galaxies (the HOD) at a given halo mass. For the species of tracers which are typically used in cosmological surveys the 1-halo term is only relevant on small scales ($k \gtrsim 1\, h\,{\rm Mpc}^{-1}$), although, since $u(k \to 0) = 1$, it still contributes a constant term on large scales.

Inclusion of the 1-halo term would lead the approximated pixel covariance of Eq. (20) to assume the expression:

$$ C_{\alpha\beta}(\vec{x}, \vec{x}') \to \delta_D(\vec{x}, \vec{x}') \times \left[ \frac{\delta_{\alpha\beta}}{\bar{n}_\alpha} + P^{2h}_{\alpha\beta} + P^{1h}_{\alpha\beta} \right] \tag{77} $$

where we write the 2-halo term $P^{2h}_{\alpha\beta} = B_\alpha B_\beta P_m$. In principle, if we are only interested in the properties of the clustering on large scales, this term can be included systematically, in every step of the calculations; see also Hamaus et al. (2010), in a similar context. These are straightforward computations, but for a general form of $P^{1h}_{\alpha\beta}$ there is no closed-form expression for the inverse of the pixel covariance matrix, which means that we cannot give explicit formulas for the Fisher matrix, the weights, the window functions, etc.


6.1 Fisher matrix of the 2-halo term for separable 
1-halo terms 


In some cases the populations of tracers are such that the 1-halo term is approximately separable, i.e., it can be expressed as a direct product of two terms, $P^{1h}_{\alpha\beta} \simeq H_\alpha H_\beta$, just as happens with the 2-halo term. We have checked that, for a class of HODs that is commonly used to describe red and blue galaxies (Zheng et al. 2005), all entries of the correlation matrix $P^{1h}_{\alpha\beta}/\sqrt{P^{1h}_{\alpha\alpha} P^{1h}_{\beta\beta}}$ are very close to unity, which justifies this approximation. However, we only verified this feature of the 1-halo term while ignoring the distinction between central and satellite galaxies, since it is not clear how to generalize $\langle N_\alpha N_\beta \rangle$ in that case. It would be interesting to find out whether this property holds for more realistic HODs.
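The kind of check described above can be illustrated with a toy model. Everything in the sketch below is an assumption made for the example: the power-law mass function with an exponential cutoff, the Gaussian halo profile, the power-law mean occupations for two tracer types, and the factorization $\langle N_\alpha N_\beta \rangle_M = \langle N_\alpha | M \rangle \langle N_\beta | M \rangle$. None of these is the HOD of Zheng et al. (2005), and the 1-halo term is normalized here by the mean densities of the tracers.

```python
import numpy as np


def trapezoid(y, x):
    """Trapezoidal rule along the last axis."""
    return np.sum(0.5 * (y[..., 1:] + y[..., :-1]) * np.diff(x), axis=-1)


def one_halo_matrix(k, log_mass, dn_dlnm, mean_n, radius):
    """Toy 1-halo matrix P^{1h}_{ab}(k), integrated over ln M."""
    u2 = np.exp(-((k * radius) ** 2))                # u^2 for a Gaussian profile, u(0) = 1
    nbar = trapezoid(dn_dlnm * mean_n, log_mass)     # mean density of each tracer
    pair = mean_n[:, None, :] * mean_n[None, :, :]   # <N_a N_b>_M, assumed factorized
    p1h = trapezoid(dn_dlnm * u2 * pair, log_mass)
    return p1h / np.outer(nbar, nbar)


def correlation(p):
    """Correlation matrix P_ab / sqrt(P_aa P_bb)."""
    d = np.sqrt(np.diag(p))
    return p / np.outer(d, d)


log_mass = np.linspace(np.log(1.0e12), np.log(1.0e15), 200)
mass = np.exp(log_mass)
dn_dlnm = 1.0e-3 * (mass / 1.0e12) ** -0.9 * np.exp(-mass / 1.0e14)   # toy mass function
radius = 0.2 * (mass / 1.0e12) ** (1.0 / 3.0)                         # toy halo size
mean_n = np.vstack([(mass / 1.0e13) ** 1.0, (mass / 3.0e12) ** 0.8])  # toy occupations
r_small_k = correlation(one_halo_matrix(0.01, log_mass, dn_dlnm, mean_n, radius))
```

By the Cauchy-Schwarz inequality the off-diagonal correlation coefficient cannot exceed one; how close it gets to one measures how well the separable form holds for the chosen occupations.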

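If the 1-halo term is separable, it turns out that we can invert the covariance matrix. This result follows from the exquisite properties of matrices that can be written as $M_{\alpha\beta} = \delta_{\alpha\beta} + v_\alpha v_\beta + u_\alpha u_\beta$. This type of matrix appeared already in Section 2, where we showed that the inverse of $M_{v,\alpha\beta} = \delta_{\alpha\beta} + v_\alpha v_\beta$ is given by $M^{-1}_{v,\alpha\beta} = \delta_{\alpha\beta} - v_\alpha v_\beta/(1 + v^2)$, where $v^2 = \sum_\alpha v_\alpha v_\alpha$.

As shown in Appendix B, the inverse of the matrix $M_{\alpha\beta} = \delta_{\alpha\beta} + v_\alpha v_\beta + u_\alpha u_\beta$ is:

$$ M^{-1}_{\alpha\beta} = \sum_{\mu\nu} M^{-1/2}_{v,\alpha\mu} \left[ \delta_{\mu\nu} - \frac{u'_\mu u'_\nu}{1 + u'^2} \right] M^{-1/2}_{v,\nu\beta} \tag{78} $$

where $M^{-1/2}_{v,\alpha\beta} = \delta_{\alpha\beta} - v_\alpha v_\beta/(1 + v^2 + \sqrt{1 + v^2})$, and $u'_\alpha = \sum_\mu M^{-1/2}_{v,\alpha\mu} u_\mu$.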
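The inverse of the identity plus two rank-one terms can be verified numerically against a direct matrix inversion. The sketch below builds the inverse square root of the first rank-one piece, rotates the second vector, and applies the rank-one inverse once more; the function names and the test vectors are arbitrary choices made for the example.

```python
import numpy as np


def inv_sqrt_rank_one(v):
    """Inverse square root of (1 + v v^T)."""
    v2 = v @ v
    return np.eye(len(v)) - np.outer(v, v) / (1.0 + v2 + np.sqrt(1.0 + v2))


def inverse_two_rank_one(v, u):
    """Inverse of (1 + v v^T + u u^T) built from two rank-one steps."""
    m_inv_half = inv_sqrt_rank_one(v)
    u_prime = m_inv_half @ u                       # u' = M_v^{-1/2} u
    core = np.eye(len(v)) - np.outer(u_prime, u_prime) / (1.0 + u_prime @ u_prime)
    return m_inv_half @ core @ m_inv_half


v = np.array([0.5, 1.0, 2.0])
u = np.array([0.3, -0.7, 1.1])
m = np.eye(3) + np.outer(v, v) + np.outer(u, u)
m_inv = inverse_two_rank_one(v, u)
```

The cost of this construction scales with the square of the number of tracers, with no general matrix inversion required.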


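After some algebra, using Eq. (78) we can express the inverse of the covariance, Eq. (77), as:

$$ C^{-1}_{\alpha\beta}(\vec{x}, \vec{x}') = \delta_D(\vec{x}, \vec{x}') \left[ \delta_{\alpha\beta}\, \bar{n}_\alpha - \bar{n}_\alpha\, \frac{P^{2h}_{\alpha\beta} + P^{1h}_{\alpha\beta} - \Sigma_{\alpha\beta}}{1 + T}\, \bar{n}_\beta \right] \tag{79} $$

where the cross-term is:

$$ \Sigma_{\alpha\beta} = \sum_\mu \bar{n}_\mu \left( P^{2h}_{\alpha\mu} P^{1h}_{\mu\beta} + P^{1h}_{\alpha\mu} P^{2h}_{\mu\beta} - P^{2h}_{\alpha\beta} P^{1h}_{\mu\mu} - P^{1h}_{\alpha\beta} P^{2h}_{\mu\mu} \right) \tag{80} $$

and the term appearing in the denominator of Eq. (79) is:

$$ T = \sum_\mu \bar{n}_\mu \left( P^{2h}_{\mu\mu} + P^{1h}_{\mu\mu} \right) + \sum_{\mu\nu} \bar{n}_\mu \bar{n}_\nu \left( P^{2h}_{\mu\mu} P^{1h}_{\nu\nu} - P^{2h}_{\mu\nu} P^{1h}_{\mu\nu} \right) \tag{81} $$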
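A closed-form inverse of this type can be tested against a brute-force inversion. The sketch below assumes separable 2-halo and 1-halo terms, $P^{2h}_{\alpha\beta} = B_\alpha B_\beta P_m$ and $P^{1h}_{\alpha\beta} = H_\alpha H_\beta$, and uses the sign convention in which the cross-term enters the numerator with a minus sign; the numerical values and names are arbitrary, and only the matrix multiplying the Dirac delta is considered.

```python
import numpy as np


def inverse_pixel_covariance(nbar, p2h, p1h):
    """Closed-form inverse of diag(1/nbar) + p2h + p1h for separable p2h, p1h."""
    cross = p2h @ np.diag(nbar) @ p1h                # sum_mu P2_{a mu} nbar_mu P1_{mu b}
    tr2 = nbar @ np.diag(p2h)                        # sum_mu nbar_mu P2_{mu mu}
    tr1 = nbar @ np.diag(p1h)                        # sum_mu nbar_mu P1_{mu mu}
    sigma = cross + cross.T - p2h * tr1 - p1h * tr2  # cross-term mixing 1h and 2h
    t = tr2 + tr1 + tr2 * tr1 - np.sum(np.outer(nbar, nbar) * p2h * p1h)
    core = (p2h + p1h - sigma) / (1.0 + t)
    return np.diag(nbar) - nbar[:, None] * core * nbar[None, :]


nbar = np.array([0.5, 1.0, 2.0])
bias = np.array([1.0, 1.2, 1.5])
h = np.array([0.3, 0.8, 0.5])
p2h = 2.0 * np.outer(bias, bias)   # B_a B_b P_m with P_m = 2 (arbitrary units)
p1h = np.outer(h, h)               # separable 1-halo term
c_inv = inverse_pixel_covariance(nbar, p2h, p1h)
```

The agreement with a direct inversion holds only for separable terms; for a general 1-halo matrix the same function would return a wrong result, which is the limitation discussed in the text.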

Compare this result with Eq. (22). We detect some familiar expressions, in particular:

$$ \mathcal{P} = \sum_\mu \mathcal{P}_\mu = \sum_\mu \bar{n}_\mu P^{2h}_{\mu\mu} \tag{82} $$

It is now useful to rename the clustering strength of the 2-halo term as $\mathcal{P}_\mu \to \mathcal{P}^{2h}_\mu$, $\mathcal{P} \to \mathcal{P}^{2h}$, and to define the 1-halo clustering strength as $\mathcal{P}^{1h} = \sum_\mu \bar{n}_\mu P^{1h}_{\mu\mu}$. The cross-terms mixing the 1-halo and the 2-halo terms appear in the combinations:


pc — \ ' p 2 ^j r- p 

-^0.0 - / ^ ^OL^ ^f. 


\h 

fip 1 


(83) 


/T^c. _ — — i~\2h 

PaP — r^aXlp Pap PaP • 


Once again, we find it useful to define the dimensionless 
clustering strengths of these cross-terms, as was done for 
the 2-halo and the 1-halo terms. They are: 


Va = riaPaa 

V 


= E7’^ 


(84) 


= XZ = E ■ 

a OL^ 

With these definitions we find that: 


1 /j^2h>j^\h 

(85) 

T^2h .-p) 1 j~)Xh >-r^2h 

^al3 r — r 

( 86 ) 


Similarly, we get: 

Yap = PX + PI 

The Fisher matrix was defined in a generic sense in Eq. (18). That definition, as well as the construction of the optimal quadratic estimators, are valid for any Gaussian variables (Tegmark et al. 1998). In a related result, Smith & Marian (2015) recently derived an optimal estimator for the matter power spectrum, as well as the Fisher matrix for the power spectrum, including not only the 1-halo term, but also the 2- and 3-halo contributions to the trispectrum, most of which are, strictly speaking, non-Gaussian contributions. We did include the 1-halo term in the pixel covariance, as well as in the trispectrum, but only through the assumption of Gaussianity of the 4-point function. Due to the non-Gaussian terms that will appear in the trispectrum, our estimators are not exactly optimal. Nevertheless, in some sense our results are more general than those of Smith & Marian (2015), since the multi-tracer estimators can be employed not only in the computation of the matter power spectrum, but also for the biases and the RSDs.

Since we keep the assumption of Gaussianity, all we have to do is work out the algebra with the covariance of Eq. (77), and its inverse, given by Eq. (79). After a lengthy calculation, we find that the Fisher matrix which generalizes the expression in the integrand of Eq. (25) can be expressed as:


-p2h _ 


X (1 -E T) - (1 + vXpXpI 

+ (1 + p^ZpY + pfp"- + pXpZ\ 
+ {i + v^pX - {P"fPY 

_ (1 + pi'*) {vXvi + vXvAi I . 


4(i-Er)2 


(87) 


It can be easily verified that taking $P^{1h} \to 0$ implies $\mathcal{P}^{1h} \to 0$, $T \to \mathcal{P}^{2h}$, etc., and this expression reduces to the matrix $\mathcal{F}_{\mu\nu}$ which is inside the integral in Eq. (25). The matrix is also manifestly symmetric.

It can also be shown that this Fisher matrix is positive-definite, with positive diagonal terms and a positive determinant. This guarantees that the covariance of the 2-halo power spectrum is also positive-definite.

We should stress once again that the expression above is only valid in the approximation that the 1-halo term is separable, i.e., $P^{1h}_{\alpha\beta}/\sqrt{P^{1h}_{\alpha\alpha} P^{1h}_{\beta\beta}} \simeq 1$ for all $\alpha$, $\beta$. For a general form of the 1-halo term, the pixel covariance matrix cannot be inverted analytically, which means that there are no closed-form expressions for the Fisher matrix or for the optimal weights. One could still compute them numerically without any difficulty.


6.2 Fisher matrix and optimal weights for the 
1-halo term 

Now, suppose that what we are in fact interested in is measuring the 1-halo term. The 1-halo term is now the “signal”, while the 2-halo term, as well as shot noise, become the “noise”. This should in fact be the case for very small scales ($k \gg 1\, h\,{\rm Mpc}^{-1}$), where the 1-halo term dominates over the 2-halo term (Cooray & Sheth 2002).

The pixel covariance is still the same, as in Eq. (77), and, as long as the 1-halo term is separable, the inverse covariance is also unaltered; see Eq. (79). The basic difference is that now, instead of writing $P^{2h}_{\alpha\beta} = B_\alpha(z, k, \mu_k) B_\beta(z, k, \mu_k) P_m(k)$, we assume that the 1-halo term can be written effectively as something like $P^{1h}_{\alpha\beta} = H_\alpha(z, k, \mu_k) H_\beta(z, k, \mu_k) U(k)$, where $U(k)$ contains information about the shape of the mean halo profile. This is a strong assumption: it means that, for $N$ species of tracers, the 1-halo term would have only $N$ degrees of freedom (the $P^{1h}_\mu = H_\mu^2 U$), while the full expression, a symmetric $N \times N$ matrix, in fact has $N(N + 1)/2$ degrees of freedom.

Keeping the hypothesis that the 1-halo term is separable, all we need to do now, in order to find its Fisher matrix, is to exchange all the 2-halo terms by the 1-halo terms in Eq. (87). This procedure can also be used to define the optimal weights that ought to be used when extracting information about the 1-halo term from galaxy surveys.

Since many of the objects defined above are already


symmetric under the exchange (this includes T, 

and V^u), the 1-halo Fisher matrix can be immediately 
written as: 


-^•1 ri _ 

11.1/ 


X + t) - (1 -f 

-f (1 -b -b 

-b (1 + 

- (1 -b p"'*) + Pi} . 


( 88 ) 


For very small scales the 2-halo term can be neglected, and 
we are left just with the 1-halo terms. 


The Fisher matrix in bins of $k$ is just as in Eq. (25):




L 


cPx(fk 
(27r)3 "" 


(89) 


The optimal weights follow in a straightforward manner 
from this expression, just as was done for the 2-halo term. 


total contribution, i.e.: 
dCap{x, x') 


dP* . 


dCap{x,x') dPjf-i dCap{x,x') 


Ml* 


/ 


(Pk 




ap*. 

Ml* 


dP}f 

MM 


(27r)3 

Ba{x,k)Bp{x' ,k) 
2Bl{xi,ki) 
H^ix, k)Hp{x',k) 




2Hl{xi,k,) 


(90) 


Substitution of this expression, together with the inverse of the pixel covariance, into Eq. (18), leads to the full Fisher matrix for the power spectrum. This can be written as:


P* . . 


= <5. 


f. 


+ 


cPx (Pk 

1 

p2h plh 
vu,i 


T 

•/ II. 


p, 


(p2h ^^2 

iii/.i 


+ 


)^ 


n. 


(91) 


p2h plh 
i/v',i'^ MMM 


where contains the cross-terms between the 1-halo and 
the 2-halo terms which follow from Eq. (90). Expressing the inverse covariance of Eq. (79) in terms of $C^{-1}_{\alpha\beta}(\vec{x}, \vec{x}') = \delta_D(\vec{x}, \vec{x}')\, D_{\alpha\beta}$, the information mixing between the 1-halo and 2-halo terms is given by:


F = 


I D 

' fll/ 


OC0 

+ Pi^D^^Pl^pDp, + PltD^^Pl*^pDpf) . (92) 

It is trivial to obtain the full expression, although it turns 
out to be rather long. It can be significantly simplified if we 
employ two additional auxiliary definitions: 


r^l,2h _ X 

^— / . -^iirv ^ 


l,2/i , 

[lOL ^OLU 1 


(93) 


6.3 Joint Fisher matrix for the 2-halo and 1-halo 
terms 

The next obvious question is: what if we wish to estimate both the 2-halo and the 1-halo terms in a multi-tracer cosmological survey, simultaneously? The two contributions are clearly correlated, so their information contents are not independent. Evidently, on either very large or very small scales the correlations between the two are small, and one can treat the signal ($P^{2h}$ on large scales; $P^{1h}$ on small scales) as effectively independent of the noise. However, on intermediate scales (around $k \sim 1\, h\,{\rm Mpc}^{-1}$) the 1-halo and the 2-halo terms may have significant correlations. Furthermore, the approximation of a separable 1-halo term becomes more accurate on those intermediate scales.


The pixel covariance is still given by Eq. (77), and its inverse also remains unchanged, but we would now be considering our “signal” as the sum $P = P^{2h} + P^{1h}$. The main difference is that the derivatives of the pixel covariance, which in the case when we neglected the 1-halo term were given

^ ^ fj,i/ — fiL/ -*-^M 


In terms of these variables we have, e.g.: 
flu 


ry2h _ 


1 + 


^ ^.p2h ^ p2h p 


and 


WfY = 




(1 - py Pl'^Pl 


(94) 

(95) 

(96) 


l-bT 

, + Pl^py, + (1 - p"7 p^. 

^ 1 + r 

as well as the analogous expressions obtained by exchanging
$1h \leftrightarrow 2h$. In terms of these definitions we have:


T’^ = 

LLU 


E f r^2h rylh . r^lh r^2h'\ 
I ^Oil/ “r ^^oc^au j 


where Z = fact, these definitions 

are also helpful when computing Tff (taking all Z —>■ Z'^^ 


by Eq. (24), should now be computed with respect to this


and W W^'‘) and (taking all Z -f Z^ 
IF^'^). 


and IF 
7 CONCLUSIONS


We have obtained optimal estimators for the Fourier analysis of multi-tracer cosmological surveys. The formulas were derived in Sec. 3, and a practical algorithm for the Fourier analysis of multi-tracer surveys was summarized in Sec. 5.2. Those are the main results of this paper.

The multi-tracer technique estimates the individual redshift-space power spectra for each tracer, $P_\alpha(z, k, \mu_k)$, taking into account the covariance between the tracers which is induced by the large-scale structure. In contrast to the estimators obtained by Percival et al. (2003) or Smith & Marian (2015), which are suited for estimating the underlying matter power spectrum after fixing the biases and the RSDs, our optimal estimators can be used to measure the power spectrum, the biases, the shape of RSDs, etc. In particular, our estimators facilitate measurements of RSDs, scale-dependent bias and non-Gaussianities from cosmological surveys of multiple tracers, helping realize the potential for determining those physical parameters to an accuracy which is not limited by cosmic variance (Seljak 2009; McDonald & Seljak 2008; Gil-Marín et al. 2010; Hamaus et al. 2011; Abramo & Leonard 2013).

We also included the contribution from the 1-halo term in our calculations (Sec. 6). Although on very large scales ($k \ll 1\, h\,{\rm Mpc}^{-1}$) the 2-halo term is dominant, the 1-halo term gives a nearly-constant contribution in that limit, adding to shot noise and, unlike shot noise, affecting the cross-correlations.

It is important to stress that our formulas are relatively simple generalizations of those by FKP (Feldman et al. 1994) and PVP (Percival et al. 2003), so readers familiar with these standard methods should have no trouble implementing the multi-tracer technique. We tested the estimators (see Sec. 5) in a wide variety of situations, and they performed quite robustly, in many instances significantly better than the FKP method. It should now be straightforward to combine cosmological surveys targeting different types of galaxies, quasars and other tracers of large-scale structure.
$\partial \mathcal{P}'_a/\partial \mathcal{P}_\mu$, and this Jacobian equals $\delta_{a\mu}$ when $\mu < N - 1$, it vanishes if $a = N - 1$ and $\mu < N - 1$, and it is equal to 1 if $a = N - 1$ and $\mu \geq N - 1$.

However, what we need for the new Fisher matrix is the Jacobian for the inverse transformation $\mathcal{P}'_a \to \mathcal{P}_\mu$, i.e., $J_{\mu a} = (\partial \mathcal{P}'_a/\partial \mathcal{P}_\mu)^{-1}$. But this turns out to be a very simple matrix: $J_{\mu a} = \delta_{\mu a}$ when $\mu < N - 1$, it vanishes if $a = N - 1$ and $\mu < N - 1$, and when $a = N - 1$ and $\mu \geq N - 1$ the Jacobian is equal to $\mathcal{P}_\mu/\mathcal{P}'_{N-1} = \mathcal{P}_\mu/(\mathcal{P}_{N-1} + \mathcal{P}_N)$; see also the discussion at the beginning of Sec.

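Hence, in the new variables the Fisher matrix (or, more precisely, the Fisher information density per unit of phase space volume) is:

$$ F'_{ab} = F[\mathcal{P}'_a, \mathcal{P}'_b] = \sum_{\mu\nu} J_{\mu a}\, \frac{\mathcal{F}_{\mu\nu}}{\mathcal{P}_\mu \mathcal{P}_\nu}\, J_{\nu b} \tag{A1} $$

where $\mathcal{F}_{\mu\nu} = F[\log \mathcal{P}_\mu, \log \mathcal{P}_\nu]$; see Eq. (26). This turns out to be given by:

$$ F'_{ab} = \begin{pmatrix} \dfrac{\mathcal{F}_{ab}}{\mathcal{P}_a \mathcal{P}_b} & \dfrac{\mathcal{F}_{a,N-1} + \mathcal{F}_{a,N}}{\mathcal{P}_a \mathcal{P}'_{N-1}} \\ {\rm Sym} & \dfrac{\mathcal{F}_{N-1,N-1} + 2\mathcal{F}_{N-1,N} + \mathcal{F}_{N,N}}{(\mathcal{P}'_{N-1})^2} \end{pmatrix} \tag{A2} $$

where the upper left block is an $(N - 2) \times (N - 2)$ matrix, the right block is an $(N - 2) \times 1$ column, the lower left block is a $1 \times (N - 2)$ row, and the lower right block is a single entry. Hence, the resulting $(N - 1)$-dimensional Fisher matrix is given simply by summing the lines and columns corresponding to the two tracers which were combined into a single type.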

Now, it can be easily verified from Eq. (26) that summing any two lines and columns of the Fisher matrix $\mathcal{F}_{\mu\nu}$ yields precisely the Fisher matrix where the new entries correspond to the Fisher information for the sum of the clustering strengths of the two species that were combined. In other words, if we take Eq. (26) and use $\mathcal{P}'_a$ to compute $\mathcal{F}'_{ab} = F[\log \mathcal{P}'_a, \log \mathcal{P}'_b]$, then the Fisher matrix $F'_{ab} = \mathcal{F}'_{ab}/(\mathcal{P}'_a \mathcal{P}'_b)$ is identical to Eq. (A2).

This argument can be iterated to show that combining any number of tracers into a single species corresponds to adding their clustering strengths, and this operation results in a simple sum of the Fisher information of those tracers.
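The statement about summing lines and columns can be checked numerically in the variables $\log \mathcal{P}_\mu$. As in the main text, the sketch assumes that the Fisher density of Eq. (26) has the form $\frac{1}{4}[\delta_{\mu\nu} \mathcal{P}_\mu \mathcal{P}/(1+\mathcal{P}) + \mathcal{P}_\mu \mathcal{P}_\nu (1-\mathcal{P})/(1+\mathcal{P})^2]$; that form and the helper names are assumptions, since Eq. (26) is not reproduced here.

```python
import numpy as np


def fisher_log(p):
    """Assumed multi-tracer Fisher density F[log P_mu, log P_nu]."""
    p = np.asarray(p, dtype=float)
    tot = p.sum()
    return 0.25 * (np.diag(p) * tot / (1 + tot)
                   + np.outer(p, p) * (1 - tot) / (1 + tot) ** 2)


def merge_last_two(fisher):
    """Sum the last two lines and columns of a Fisher matrix."""
    n = fisher.shape[0]
    s = np.zeros((n, n - 1))
    s[: n - 1, : n - 1] = np.eye(n - 1)   # tracers 1 .. N-1 keep their own entry
    s[n - 1, n - 2] = 1.0                 # tracer N is added onto tracer N-1
    return s.T @ fisher @ s


strengths = [2.0, 0.3, 0.7]
merged = merge_last_two(fisher_log(strengths))
direct = fisher_log([2.0, 0.3 + 0.7])
```

Here `merged` is obtained by summing lines and columns of the three-tracer matrix, while `direct` is the two-tracer matrix evaluated with the summed clustering strength; under the assumed form of Eq. (26) the two coincide.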


APPENDIX A: DEGENERATE TRACERS IN
THE BASIS OF THE AUTO-POWER SPECTRA

Suppose we have $N$ types of tracers, but we would like to combine the last 2 of those tracers into a single species, for a new total of $N - 1$ tracers. For simplicity, let's regard our original parameters as $\mathcal{P}_\mu$ ($\mu = 1 \ldots N$). We would like to change variables to $\mathcal{P}'_a$ ($a = 1 \ldots N - 1$), where $\mathcal{P}'_a = \mathcal{P}_a$ for $a = 1, \ldots, N - 2$, and the new tracer species is constructed by combining the last two tracers, $\mathcal{P}'_{N-1} = \mathcal{P}_{N-1} + \mathcal{P}_N$. When the biases of the two species which are combined into one are identical, $B_{N-1} = B_N = B'_{N-1}$, this linear combination ensures that $\mathcal{P}'_{N-1} = (\bar{n}_{N-1} + \bar{n}_N)\, B'^2_{N-1} P_m$, so clearly $\bar{n}'_{N-1} = \bar{n}_{N-1} + \bar{n}_N$, i.e., the total number of galaxies of the new species is the sum of the number of galaxies of the two original species.

The Jacobian for the transformation $\mathcal{P}_\mu \to \mathcal{P}'_a$ is


APPENDIX B: INVERSION OF THE
COVARIANCE MATRIX

Consider a matrix of the form $M_v = \mathbb{1} + v \otimes v$, i.e., $M_{v,\alpha\beta} = \delta_{\alpha\beta} + v_\alpha v_\beta$. As discussed in Section 2, it can be shown that:

$$ M_v^{-1} = \mathbb{1} - \frac{v \otimes v}{1 + v^2} \tag{B1} $$

where $v^2 = {\rm Tr}(v \otimes v) = v_\alpha v_\alpha$. This is in fact a special case of the Sherman-Morrison-Woodbury formula (Woodbury 1950).


® In fact, since this Jacobian is not a square matrix, it only has a 
pseudo-inverse. However, in this case the pseudo-inverse is exact. 
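The matrix $M_v$ also has a simple “square root”, as well as an “inverse square root”, given by:

$$ M_v^{1/2} = \mathbb{1} + \frac{v \otimes v}{1 + \sqrt{1 + v^2}} \tag{B2} $$

$$ M_v^{-1/2} = \mathbb{1} - \frac{v \otimes v}{1 + v^2 + \sqrt{1 + v^2}} \tag{B3} $$

where $M_v^{1/2} \cdot M_v^{1/2} = M_v$ and $M_v^{-1/2} \cdot M_v^{-1/2} = M_v^{-1}$, from which follows that:

$$ M_v^{-1/2} \cdot M_v \cdot M_v^{-1/2} = \mathbb{1} \tag{B4} $$

Now, take a matrix $M = M_v + u \otimes u$. The first piece of that matrix can be diagonalized following the procedure outlined above, so we have that:

$$ M_v^{-1/2} \cdot M \cdot M_v^{-1/2} = \mathbb{1} + (M_v^{-1/2} \cdot u) \otimes (M_v^{-1/2} \cdot u) = \mathbb{1} + u' \otimes u' \tag{B5} $$

where $u' = M_v^{-1/2} \cdot u$ (i.e., $u'_\alpha = \sum_\beta M^{-1/2}_{v,\alpha\beta}\, u_\beta$). But the matrix of Eq. (B5), $M_{u'} = \mathbb{1} + u' \otimes u'$, can now be inverted using the equivalent of Eq. (B1), and moreover it has an inverse square root as in Eq. (B3). Therefore, we have that:

$$ M_{u'}^{-1/2} \cdot M_v^{-1/2} \cdot M \cdot M_v^{-1/2} \cdot M_{u'}^{-1/2} = \mathbb{1} $$

Therefore, the inverse of the matrix $M$ is given by:

$$ M^{-1} = M_v^{-1/2} \cdot M_{u'}^{-1} \cdot M_v^{-1/2} = M_v^{-1/2} \cdot \left[ \mathbb{1} - \frac{u' \otimes u'}{1 + u'^2} \right] \cdot M_v^{-1/2} $$

Of course, one could equally write this inverse as:

$$ M^{-1} = M_u^{-1/2} \cdot M_{v'}^{-1} \cdot M_u^{-1/2} \tag{B6} $$

where $v' = M_u^{-1/2} \cdot v$.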